Class Info and Policies
Classes and Job Scheduling
All LoadLeveler jobs must be submitted to a valid submit class. If
the class does not exist, no error message is issued: the job is
accepted, but it sits in the queue indefinitely. If this happens to you,
delete the job with the llcancel command and resubmit it to an available
class.
NERSC users specify one of the following submit classes to queue jobs. Upon
submission, the job is routed to the appropriate LoadLeveler class according
to the following criteria. (Users cannot directly access the LoadLeveler
classes.)
| Submit Class [1] | Job Type | Destination Class [2] | Nodes | Available Processors | Max Wallclock | Relative Priority [3] | MPP Charge (units of nodes*wall hrs) | Availability |
|---|---|---|---|---|---|---|---|---|
| interactive | parallel | interactive | 1-4 | 1-32 | 30 mins | 1 | 48 | Everyone |
| debug [4] | parallel | debug | 1-8 | 1-64 | 30 mins | 2 | 48 | Everyone |
| premium [5] | parallel | premium | 1-48 | 1-384 | 12 hrs | 4 | 96 | Everyone |
| regular | parallel | reg_1 | 1-15 | 1-120 | 36 hrs | 5 | 48 | Everyone |
| regular | parallel | reg_16 | 16-31 | 121-248 | 18 hrs | 5 | 48 | Everyone |
| regular | parallel | reg_32 | 32-48 | 249-384 | 18 hrs | 5 | 48 | Everyone |
| low | parallel | low | 1-32 | 1-256 | 12 hrs | 6 | 24 | Everyone |
| special [6] | parallel | special | 1-64 | 1-512 | 48 hrs | 3 | 48 | By special arrangement |
| full_config [6] | parallel | full_config | 1-ALL | 1-ALL | 48 hrs | 3 | 48 | By special arrangement |
Notes
1 - This is the class name to be used in LoadLeveler scripts.
2 - Users cannot submit scripts directly to a destination class,
but this is the class name that will appear when using job monitoring utilities.
3 - The priorities listed in the table are relative. NERSC assigns priorities in terms of
"equivalent days waiting in the queue".
In addition to the relative priority given to jobs depending on their LoadLeveler class, certain projects
with high priority within DOE receive a "scheduling boost". These tend to be INCITE projects.
4 - Four nodes are reserved exclusively for interactive and debug use on weekdays from 5:00 to 18:00
Pacific Time.
5 - The intent of the premium queue is to
allow faster turnaround before conferences and urgent project deadlines.
It should be used with care, and in most cases a project should not spend more than 10 percent of its
time in premium.
6 - Available by special arrangement only.
See also Queue Policies, below, for information on run limits.
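The routing from a submit class to a destination class can be sketched in a few lines of Python. This is only an illustration of the table above: the node ranges and wall clock limits are copied from the table, while the `ROUTES` dictionary and the `route` function are hypothetical names and not part of LoadLeveler. The special and full_config classes are left out because their upper node bound ("ALL") is not given as a number. Unlike the real system, which accepts a job for a nonexistent class without complaint, this sketch raises an error.

```python
# Illustrative sketch of the submit-class routing shown in the table.
# Data is copied from the table; names and structure are assumptions.

# submit class -> list of (destination class, min nodes, max nodes, max wallclock hours)
ROUTES = {
    "interactive": [("interactive", 1, 4, 0.5)],
    "debug": [("debug", 1, 8, 0.5)],
    "premium": [("premium", 1, 48, 12.0)],
    "regular": [
        ("reg_1", 1, 15, 36.0),
        ("reg_16", 16, 31, 18.0),
        ("reg_32", 32, 48, 18.0),
    ],
    "low": [("low", 1, 32, 12.0)],
}


def route(submit_class, nodes):
    """Return (destination_class, max_wallclock_hours) for a request."""
    if submit_class not in ROUTES:
        raise ValueError(f"unknown submit class: {submit_class}")
    for dest, min_nodes, max_nodes, max_hours in ROUTES[submit_class]:
        if min_nodes <= nodes <= max_nodes:
            return dest, max_hours
    raise ValueError(f"{nodes} nodes is outside the range for {submit_class}")
```

For example, `route("regular", 20)` selects the `reg_16` row, so a 20-node regular job is limited to 18 hours rather than the 36 hours available to smaller regular jobs.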
You can use the llclass command on the system to obtain information
about the LoadLeveler classes. Detailed information about a single LoadLeveler
class can be found using llclass -l classname.
If you request more wall clock time than the class allows (as indicated by
the Max Wallclock column in the table above), your job
is submitted with the wall clock time reduced to the maximum allowed.
If you omit the requested time, a default of 30 minutes is used.
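The two rules above (reduce to the class maximum, default to 30 minutes) amount to a small calculation. The following Python sketch restates them; the function name `effective_wallclock` and its use of minutes as the unit are illustrative assumptions, not something LoadLeveler exposes.

```python
DEFAULT_MINUTES = 30  # default used when no wall clock time is requested


def effective_wallclock(requested_minutes, class_max_minutes):
    """Return the wall clock limit (in minutes) a job would end up with.

    requested_minutes may be None to model a job that omits the request.
    """
    if requested_minutes is None:
        requested_minutes = DEFAULT_MINUTES
    # Requests above the class maximum are reduced to the maximum.
    return min(requested_minutes, class_max_minutes)
```

A request of 40 hours in a class with a 36 hour limit would therefore come out as 36 hours (2160 minutes), and a job with no request would get 30 minutes.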
Your job will be charged and scheduled according to the charge rate and priority
listed for its class in the table above. Both interactive and debug are charged
at the regular rate.
See MPP Accounting.
The classes are configured to give the best service to premium and
regular jobs.
Premium jobs are charged
at twice the rate of regular jobs, but are scheduled at a higher priority.
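As a rough worked example of the charge rates, the sketch below assumes that the MPP Charge column in the table is a multiplier applied to nodes times wall clock hours. That reading of the column, the `CHARGE_PER_NODE_HOUR` dictionary, and the `mpp_charge` function are assumptions made for illustration; the authoritative rules are in MPP Accounting.

```python
# Charge factors copied from the MPP Charge column of the table.
# Treating them as a per node-hour multiplier is an assumption.
CHARGE_PER_NODE_HOUR = {
    "interactive": 48,
    "debug": 48,
    "premium": 96,
    "regular": 48,
    "low": 24,
    "special": 48,
    "full_config": 48,
}


def mpp_charge(submit_class, nodes, wall_hours):
    """Estimate the charge for a job as factor * nodes * wall clock hours."""
    return CHARGE_PER_NODE_HOUR[submit_class] * nodes * wall_hours
```

Under this assumption a 4-node, 2-hour job would cost 384 units in regular, 768 in premium, and 192 in low, which matches the statement that premium is charged at twice the regular rate.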
LoadLeveler uses a scheduling technique called "backfilling". This
method starts smaller, shorter jobs if they will not affect the start time of
the job that is scheduled to begin next. This scheduling technique is
advantageous from both a user and a system perspective: it allows faster
turnaround for shorter jobs, and it maximizes system usage.
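The idea behind backfilling can be illustrated with a deliberately simplified Python sketch. It is not LoadLeveler's actual algorithm: it assumes that the scheduler already knows how many nodes are free now and how long it will be until the next scheduled job starts, and it only backfills jobs that would finish before that start time. The `Job` class and `backfill` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int
    hours: float  # requested wall clock time


def backfill(candidates, free_nodes, hours_until_top_job):
    """Pick waiting jobs that can start now without delaying the top job.

    candidates are examined in queue order. A job is started only if it
    fits in the currently free nodes and would finish before the next
    scheduled job is due to start.
    """
    started = []
    for job in candidates:
        fits_nodes = job.nodes <= free_nodes
        finishes_in_time = job.hours <= hours_until_top_job
        if fits_nodes and finishes_in_time:
            started.append(job.name)
            free_nodes -= job.nodes  # these nodes are now in use
    return started
```

With 6 free nodes and 3 hours until the next scheduled job, a 4-node 2-hour job and a 2-node 1-hour job could both be started, while a 2-node 10-hour job would have to wait. This is why requesting an accurate, short wall clock time tends to improve turnaround.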
NERSC Queue Policies for Bassi
- For the production batch classes, each user may have:
  - 3 jobs running (this parameter can be adjusted depending on system load).
  - 4 jobs in Idle state (jobs queued to run; this parameter can be adjusted
    depending on system load).
  If you have 4 jobs queued (in Idle state) and need to run an interactive
  or debug job, place one of your jobs on User Hold: llhold jobid.
  To requeue the job: llhold -r jobid.
- The combined number of debug and interactive jobs that a user may
have submitted or running at a given time must be two or fewer.
Note that this policy only applies to jobs run in the interactive
and debug batch classes. This includes parallel jobs (anything
compiled with one of the "mp" compilers, e.g. mpxlf90) that are
executed from the command line, as well as those jobs (parallel or
otherwise) that are explicitly submitted to these two classes with
the "llsubmit" command. The policy has NO effect on sequential
programs executed from the command line, including all the normal
Unix commands.
- The interactive and debug classes are to be used for code development, testing, and debugging. Production runs are strictly prohibited from using the interactive and debug classes. User accounts are subject to suspension if they are determined to be using the interactive or debug class for production computing.
- Any job that has been in the queue for 7 days or more and is in the "user
hold" state (LoadLeveler status HU) will be removed from the system. Note that
this means:
  - Jobs may not be held for more than 7 days; and
  - Jobs older than 7 days may not be held.
- Since jobs on User Hold age in the queue, their release may
perturb the scheduler such that overall system throughput is degraded. In such
circumstances NERSC may change the state of User Hold jobs to System
Hold, and release them only when overall system throughput will not
be affected.
- A 60 minute time limit is enforced on all user processes
on the login nodes.
- Job "chaining" in the debug and interactive classes is strictly
forbidden. Chaining is defined as using a batch script to submit another
batch script. User accounts are subject to suspension if they
are found to be chaining jobs in the debug or interactive classes.
- Bassi is occasionally removed from service for maintenance.
Users will be given seven days' notice before such events, usually in
the "Message of the Day" (MOTD), which is displayed upon login and is also
available here.
Usually, a system reservation will be made so that all jobs finish normally before a maintenance period; however, any jobs still running at the start of a maintenance period, for whatever reason, may
be terminated.
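The per-user limits in the policies above can be summarized as a small check. The Python sketch below is a reading aid only: the constants are taken from the list above (and, as noted there, the running and idle limits can be adjusted depending on system load), while the function names and the way the counts are passed in are assumptions. It does not query LoadLeveler.

```python
MAX_RUNNING = 3            # production jobs running per user
MAX_IDLE = 4               # production jobs in Idle state per user
MAX_DEBUG_INTERACTIVE = 2  # combined debug and interactive jobs per user
MAX_HOLD_DAYS = 7          # age at which a job on User Hold is removed


def exceeded_limits(running, idle, debug_interactive):
    """Return the names of the per-user limits that the counts exceed."""
    exceeded = []
    if running > MAX_RUNNING:
        exceeded.append("running")
    if idle > MAX_IDLE:
        exceeded.append("idle")
    if debug_interactive > MAX_DEBUG_INTERACTIVE:
        exceeded.append("debug+interactive")
    return exceeded


def subject_to_removal(days_in_queue, on_user_hold):
    """True if a job matches the 7-day User Hold removal policy."""
    return on_user_hold and days_in_queue >= MAX_HOLD_DAYS
```

For instance, a user with 3 running jobs, 4 idle jobs, and 2 debug jobs is exactly at every limit, so `exceeded_limits(3, 4, 2)` returns an empty list.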