Contents of /nl.nikhef.pdp.dynsched-pbs-plugin/branches/RB-2.3.0/RELEASE

Revision 2310
Thu Jun 23 14:16:28 2011 UTC by templon
File size: 6512 byte(s)
Update documentation and version number

This file contains release notes and a change history for the
pbs/torque and maui plugins for the lcg-info-dynamic-scheduler information provider.
The notes are listed most recent first.

Release 2.3.0

Changed the RPM name, to emphasize that this package is a plugin for the dynamic
scheduler program and not a dynamic scheduler itself.

Release 2.2.2

Fixed a bug in parsing the output of "qstat -f". The error condition had
not been reported previously; it was caused by job staging errors being reported
via the "sched_hint" attribute. Empty lines in this error output broke the
parsing algorithm.

lrmsinfo-pbs previously had the location of the python modules determined at
build time and embedded in the program. This has been removed since, in EMI, everything
goes in system default locations (see also the Makefile changes below). For
local (non-RPM) installs, you will need to set PYTHONPATH accordingly.

A testing framework was introduced (available in svn, not in the RPM). It currently
defines a single test, which checks whether the "qstat -f" output described above
is parsed correctly.

The Makefile was changed so that the python library path complies with system
defaults.

Release 2.2.1

Dependencies changed back to the original set, so that this version's RPM
dependencies are compatible with those of previous versions (and also
with the lcg-info-dynamic-scheduler-generic package).

Release 2.2.0

Moved to new repository format and changed RPM name. Rather large changes in two areas.

First, in the logFileParser part of the pbsServer.py code: the new "account" field printed
in the torque accounting logs broke the old parser; this is fixed in the current release.

Second, in how the LiveServer code parses the 'qstat -f' output. When present (recent
versions of torque), the code now uses 'startTime' as the start time of the job, and uses
'euser' and 'egroup' as the user and group of the job. If these fields are not present,
the code reverts to the old behavior. Using startTime gives slightly different results, since
startTime is present immediately upon job start, whereas walltime (used in the old calculation)
is only updated once per mom update interval, meaning there were always a few jobs in the
system that were in state "running" but that had no startTime. This will no longer be the case.

Three user-visible changes: there are two new job attributes in the "Job" object. One is the
walltime used by the job; the other is the 'startAnchor' field, which tells you whether the code
used 'startTime' to find the start time or did it the old way, by subtracting the walltime used
from the current time. Both values are printed by lrmsinfo-pbs. These two changes are in
principle user-visible, but only for users who use lrmsinfo-pbs directly.
Finally, vomaxjobs-maui accepts an argument '-k' to provide a key file to the diagnose command.
This is needed e.g. for installations where the diagnose client command is on RHEL4 and the maui
server is on RHEL5 (different build = different key).
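The start-time logic described above can be sketched roughly as follows. This
is an illustration only, not the LiveServer code: the function names, the
dictionary keys, the assumption that 'startTime' is in epoch seconds and that
the walltime used is an "HH:MM:SS" string, and the anchor labels returned are
all assumptions made for this example.

```python
import time


def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def resolve_start(attrs, now=None):
    """Return (start_epoch, anchor) for a job, or (None, None) if unknown.

    attrs is a hypothetical dict of already-parsed job attributes.
    """
    if now is None:
        now = time.time()
    if "startTime" in attrs:
        # Preferred: the start time reported directly by the server.
        return float(attrs["startTime"]), "startTime"
    if "walltime" in attrs:
        # Old behavior: current time minus the walltime used so far.
        return now - hms_to_seconds(attrs["walltime"]), "walltime"
    # Running job whose walltime has not been updated yet.
    return None, None
```

For example, resolve_start({"walltime": "00:10:00"}, now=2000) falls back to
the old calculation and reports a start 600 seconds before "now".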

Information below this point refers to older versions of the package, under the older naming
scheme.

Release 2.1.0
- Remove support for specifying hostname on command line (savannah bug 35662)

Release 2.0.1
- Fix LICENSE, otherwise no change.

Release 2.0.0
- Uses new lrms.py, otherwise no changes.

Release 1.6.1
- Update documentation and examples to be consistent with 2.0 release

Release 1.6.0

- Some changes to build system (three targets increases aggravation)
- Some changes to pbsServer classes to assist in debugging.
- Some changes to vomaxjobs-maui to assist in debugging/testing;
  also fixed various unreported bugs discovered during testing.
- Changed mapping of pbs/torque job states in pbs classes; up till now
  a job was either queued (Q) or running (any other state). Now we have:

  From the qstat (torque 2.0.0p4) man page:

  C - Job is completed after having run. (mapped to 'done')
  E - Job is exiting after having run. (mapped to 'running')
  H - Job is held. (mapped to 'pending')
  Q - Job is queued, eligible to run or routed. (mapped to 'queued')
  R - Job is running. (mapped to 'running')
  T - Job is being moved to new location. (mapped to 'pending')
  W - Job is waiting for its execution time. (mapped to 'queued')
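The mapping listed above can be written as a simple lookup table. The names
STATE_MAP and map_state are made up for this example, and the way the actual
pbs classes treat an unknown state code is not stated here; this sketch simply
raises an error.

```python
# Torque job state code -> state name, as listed in the release note above.
STATE_MAP = {
    "C": "done",
    "E": "running",
    "H": "pending",
    "Q": "queued",
    "R": "running",
    "T": "pending",
    "W": "queued",
}


def map_state(code):
    """Return the mapped state name for a one-letter torque state code."""
    try:
        return STATE_MAP[code]
    except KeyError:
        # Assumption for this sketch: unknown codes are treated as an error.
        raise ValueError("unknown job state code: %r" % (code,))
```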

Release 1.5.2

pbs package: Fix to vomaxjobs-maui to deal with cases where there is
extra 'warning' output near the top of the command output from diagnose -g.

Release 1.5.1

Fix dependency problems with RPMs.

Release 1.5.0

In vomaxjobs-maui, adapt to handle MAXPROC specifications like
MAXPROC=soft,hard. The code reports the 'hard' limit, since
this is relevant when the system is not full, and this is when
it's needed. Maui uses the soft limit on a full system, but
in this case the info provider will drop FreeSlots to zero as
soon as jobs remain in the queued state instead of executing
immediately.
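A minimal sketch of picking the hard limit out of such a specification is
shown below. The function name hard_maxproc is hypothetical, and the sketch
assumes the specification has already been isolated as a single token; how
vomaxjobs-maui extracts it from the diagnose output is not shown. A single
value (MAXPROC=N) is assumed here to serve as the hard limit.

```python
def hard_maxproc(spec):
    """Return the hard limit from 'MAXPROC=N' or 'MAXPROC=soft,hard'."""
    name, _, value = spec.partition("=")
    if name.strip() != "MAXPROC" or not value.strip():
        raise ValueError("not a MAXPROC specification: %r" % (spec,))
    # With 'soft,hard' the hard limit is the last field; with a single
    # value that value is used as-is (an assumption of this sketch).
    return int(value.split(",")[-1])
```

For instance, hard_maxproc("MAXPROC=20,40") yields the hard limit of 40.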

Release 1.4.2

In pbsServer.py: included Steve Traylen's patch to deal with jobs for which the
uid/gid printed by 'qstat' is not listed in the running machine's
pw DB. This can happen when the CE is not the same physical
machine as the actual LRMS server.
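What the patch itself does is not described here. As a general illustration of
tolerating accounts that are missing from the local password database, a
lookup can fall back to a default instead of failing. The function name
group_of_user and the choice of fallback value are assumptions made for this
example.

```python
import grp
import pwd


def group_of_user(username, default=None):
    """Return the primary unix group name of username, or default if the
    account (or its group) is unknown on this machine."""
    try:
        gid = pwd.getpwnam(username).pw_gid
        return grp.getgrgid(gid).gr_name
    except KeyError:
        # pwd.getpwnam and grp.getgrgid raise KeyError for unknown entries,
        # e.g. when the CE does not share accounts with the LRMS server.
        return default
```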

Estimated Response Time Info Providers (v 1.4.1)
------------------------------------------------

This information provider is new in LCG 2.7.0 and is
contained in two RPMs, lcg-info-dynamic-scheduler-generic
and lcg-info-dynamic-scheduler-pbs. Sites using torque/pbs
as an LRMS and Maui as a scheduler are fully supported by
this configuration; those using other schedulers and/or
LRMSs will need to provide the appropriate back-end
plugins.

For sites meeting the following criteria, the system should
work out of the box with no modifications whatsoever:

    LRMS == torque
    scheduler == maui
    vo names == unix group names of that vo's pool accounts

Documentation on what to do if this is not the case can be
found in the file

    lcg-info-dynamic-scheduler.txt

in the doc directory

    /opt/lcg/share/doc/lcg-info-dynamic-scheduler

There is also documentation in this directory describing
the requirements on the back-end commands you will need to
provide if you are using a different
scheduler or LRMS. Tim Bell at CERN can help people
using LSF.
