
Contents of /nl.nikhef.pdp.dynsched-pbs-plugin/branches/RB-2.2.2/RELEASE


Revision 2257
Tue Apr 5 12:32:28 2011 UTC by templon
File size: 6014 byte(s)
- removed compile-time path from lrmsinfo-pbs; system path will be used
- for in-place installs, use PYTHONPATH on the command line
- document changes in RELEASE file
- changes in Makefile to have python libs installed in standard places

This file contains release notes and a change history for the pbs/torque and
maui plugins for the lcg-info-dynamic-scheduler information provider.
The notes are most recent first.
Release 2.2.2
A bug was fixed in parsing the output of "qstat -f". The error condition had
not been reported before; it was caused by job staging errors being reported
via the "sched_hint" attribute. Empty lines in this error output broke the
parsing algorithm.
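As an illustration of the failure mode, the sketch below shows a "qstat -f"-style parser that tolerates blank lines inside a multi-line attribute value such as "sched_hint" error text. This is not the actual pbsServer.py code; the function name and details are hypothetical.

```python
# Hypothetical sketch: parse "qstat -f" style output into per-job attribute
# dicts. Blank lines inside a record (e.g. inside sched_hint error text) are
# skipped rather than treated as a record terminator.

def parse_qstat_f(text):
    jobs = {}
    current = None   # attribute dict of the job being parsed
    key = None       # last attribute seen, for continuation lines
    for raw in text.splitlines():
        if raw.startswith("Job Id:"):
            jobid = raw.split(":", 1)[1].strip()
            current = jobs[jobid] = {}
            key = None
        elif current is None or not raw.strip():
            continue                      # skip blanks instead of breaking
        elif raw.startswith("\t"):
            if key:                       # continuation of previous attribute
                current[key] += raw.strip()
        elif "=" in raw:
            key, _, value = raw.partition("=")
            key = key.strip()
            current[key] = value.strip()
    return jobs
```

A blank line followed by a tab-indented continuation thus stays attached to the attribute it belongs to, instead of derailing the record.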
A testing framework was introduced (available in the svn, not in the RPM);
at present it defines a single test, which checks that the situation described
above is parsed correctly.
The Makefile was changed so that the python library path complies with system
defaults.
Release 2.2.1
Dependencies changed back to the original set, so that this version's RPM
dependencies are compatible with those of previous versions (and also
with the lcg-info-dynamic-scheduler-generic package).
Release 2.2.0

Moved to the new repository format and changed the RPM name. Rather large
changes in two areas. First, in the logFileParser part of the pbsServer.py
code: the new "account" field printed in the torque accounting logs broke the
old parser; this is fixed in the current release. The second set of changes
concerns how the LiveServer code parses the 'qstat -f' output. When present
(recent versions of torque), the code now uses 'startTime' as the start time
of the job, and 'euser' and 'egroup' as the user and group of the job. If
these fields are not present, the code reverts to the old behavior. Using
'startTime' gives slightly different results: 'startTime' is present
immediately upon job start, whereas walltime (used in the old calculation) is
only updated once per mom update interval, so there were always a few jobs in
the system in state "running" that had no start time. This is no longer the
case.
Three user-visible changes. There are two new attributes in the "Job" object:
one is the walltime used by the job, the other is the 'startAnchor' field,
which tells you whether the code used 'startTime' to find the start time or
computed it the old way by subtracting the walltime used from the current
time. Both are printed by lrmsinfo-pbs. These two changes are in principle
user-visible, but only for users directly using lrmsinfo-pbs. Finally,
vomaxjobs-maui accepts an argument '-k' to provide a key file to the diagnose
command. This is needed for e.g. installations where the diagnose client
command is on RHEL4 and the maui server is on RHEL5 (different build =
different key).
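The start-time fallback described above can be sketched as follows. The function name, attribute keys, and return convention here are illustrative, not the package's actual Job class.

```python
# Hypothetical sketch of the start-time selection: prefer the explicit start
# time reported by recent torque versions; otherwise fall back to deriving
# it from the walltime already consumed (which lags by up to one mom update
# interval). The returned anchor says which method was used.

def job_start(attrs, now):
    """Return (start_epoch, anchor) for a running job."""
    if "start_time" in attrs:
        # recent torque: exact value, available immediately on job start
        return attrs["start_time"], "start_time"
    # old behaviour: current time minus walltime used so far
    return now - attrs.get("walltime_used", 0), "walltime"
```

The anchor plays the role of the 'startAnchor' field: consumers of lrmsinfo-pbs output can see which of the two calculations produced the start time.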
Information below this point refers to older versions of the package, under
the older naming scheme.
Release 2.1.0
- Remove support for specifying hostname on the command line (savannah bug 35662)
Release 2.0.1
- fix LICENSE, otherwise no change.
Release 2.0.0
- uses new lrms.py, otherwise no changes.
Release 1.6.1
- update documentation and examples to be consistent with the 2.0 release
Release 1.6.0

- some changes to the build system (three targets increases aggravation)
- some changes to the pbsServer classes to assist in debugging.
- some changes to vomaxjobs-maui to assist in debugging/testing;
  also fixed various unreported bugs discovered during testing.
- Change the mapping of pbs/torque job states in the pbs classes; until now
  a job was either queued (Q) or running (any other state). Now we have:
From the qstat (torque 2.0.0p4) man page:

C - Job is completed after having run (mapped to 'done')
E - Job is exiting after having run (mapped to 'running')
H - Job is held (mapped to 'pending')
Q - Job is queued, eligible to run or routed (mapped to 'queued')
R - Job is running (mapped to 'running')
T - Job is being moved to a new location (mapped to 'pending')
W - Job is waiting for its execution time (mapped to 'queued')
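The mapping above amounts to a simple lookup table. A sketch (the real pbs classes may structure this differently; the fallback to 'running' mirrors the old behaviour for unrecognised states):

```python
# Mapping of pbs/torque single-letter job states to the provider's states,
# as listed above.
PBS_STATE_MAP = {
    "C": "done",      # completed after having run
    "E": "running",   # exiting after having run
    "H": "pending",   # held
    "Q": "queued",    # queued, eligible to run or routed
    "R": "running",   # running
    "T": "pending",   # being moved to a new location
    "W": "queued",    # waiting for its execution time
}

def map_state(pbs_state):
    # unknown states default to 'running', matching the old behaviour
    # where anything other than Q counted as running
    return PBS_STATE_MAP.get(pbs_state, "running")
```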
Release 1.5.2

pbs package: Fix to vomaxjobs-maui to deal with cases where there is
extra 'warning' output near the top of the output from 'diagnose -g'.
Release 1.5.1

fix dependency problems with the RPMs.
Release 1.5.0

in vomaxjobs-maui, adapt to handle MAXPROC specifications like
MAXPROC=soft,hard. The code reports the 'hard' limit, since this is relevant
when the system is not full, which is when it is needed. Maui uses the soft
limit on a full system, but in that case the info provider will drop
FreeSlots to zero as soon as jobs remain in the queued state instead of
executing immediately.
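The parsing described above can be sketched like this; the function name and return convention are illustrative, not the actual vomaxjobs-maui code.

```python
# Hypothetical sketch: extract the limit to report from a Maui MAXPROC
# specification. For "MAXPROC=soft,hard" the hard limit is returned, as
# described above; a single-valued "MAXPROC=n" is returned as-is.

def parse_maxproc(spec):
    """Return the hard limit from 'MAXPROC=soft,hard' or 'MAXPROC=n'."""
    value = spec.split("=", 1)[1]
    parts = value.split(",")
    return int(parts[-1])   # last field is the hard limit when both appear
```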
Release 1.4.2

in pbsServer.py: included Steve Traylen's patch to deal with jobs for which
the uid/gid printed by 'qstat' is not listed in the running machine's
password DB. This can happen when the CE is not the same physical machine as
the actual LRMS server.
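The general shape of such a tolerant lookup is sketched below; this is illustrative only, not the actual patch, and the fallback of reporting the raw numeric id is an assumption.

```python
# Hypothetical sketch: resolve a uid to a user name, falling back to the
# numeric id when it is not in the local password database (e.g. when the
# CE is not the same physical machine as the LRMS server).
import pwd

def owner_name(uid):
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)   # unknown locally: report the raw uid instead
```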
Estimated Response Time Info Providers (v 1.4.1)
------------------------------------------------
This information provider is new in LCG 2.7.0 and is contained in two RPMs,
lcg-info-dynamic-scheduler-generic and lcg-info-dynamic-scheduler-pbs. Sites
using torque/pbs as an LRMS and Maui as a scheduler are fully supported by
this configuration; those using other schedulers and/or LRMS systems will
need to provide the appropriate back-end plugins.
For sites meeting the following criteria, the system should work out of the
box with no modifications whatsoever:

LRMS == torque
scheduler == maui
vo names == unix group names of that vo's pool accounts
Documentation on what to do if this is not the case can be found in the file

lcg-info-dynamic-scheduler.txt

in the doc directory

/opt/lcg/share/doc/lcg-info-dynamic-scheduler
There is also documentation in this directory describing the requirements on
the backend commands you will need to provide in case you are using a
different scheduler or LRMS. Tim Bell at CERN can help people using LSF.

