National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

Running Jobs on Davinci

Davinci is primarily intended for interactive tasks encompassing visualization, data mining and data analysis. Examples are:

  • Graphical or interactive visualization applications such as AVS, EnSight, and VisIt.
  • Interactive command line data analysis applications like Matlab, Mathematica, Maple, and IDL.
  • Users' own visualization and data analysis programs.

Interactive Usage Policy

Due to the dynamic and unpredictable nature of visualization and data analysis, NERSC will attempt to provide equitable access to Davinci's resources through the enforcement of certain usage guidelines. The most crucial resources are processors (32 CPUs) and memory (192 GB).

Processor Usage Policy

No single interactive job should use more than 12 processors, and no single interactive job should consume more than 12 processor-hours in total. The processor-hours may be spread over multiple processors. For example, 12 processor-hours could be:

  • a single job using 1 processor for 12 hours;
  • a single job using 2 processors for 6 hours;
  • a single job using 4 processors for 3 hours; or
  • a single job using 12 processors for 1 hour.

Memory Usage Policy

No single interactive job should use more than 80 GB of physical memory.
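
The processor and memory guidelines above can be checked mechanically before starting a long interactive session. The following is a minimal sketch, not a NERSC tool: the function name within_interactive_policy and its argument order (processors, run time in minutes, memory in GB) are hypothetical, and time is taken in whole minutes so that plain integer arithmetic suffices.

```shell
# Hypothetical helper: returns success (0) when a planned interactive job
# stays within the guidelines stated above.
#   $1 = processors, $2 = run time in minutes, $3 = physical memory in GB
within_interactive_policy() {
    ncpus=$1
    minutes=$2
    mem_gb=$3

    # No more than 12 processors.
    [ "$ncpus" -le 12 ] || return 1
    # No more than 12 processor-hours (720 processor-minutes).
    [ $((ncpus * minutes)) -le 720 ] || return 1
    # No more than 80 GB of physical memory.
    [ "$mem_gb" -le 80 ] || return 1
    return 0
}

# 4 processors for 3 hours with 32 GB is exactly 12 processor-hours.
if within_interactive_policy 4 180 32; then
    echo "within policy"
else
    echo "outside policy: contact NERSC User Services first"
fi
```

A job that fails this check is not forbidden; as described under Exceptions to Interactive Policy, it is a case to discuss with NERSC User Services beforehand.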

Exceptions to Interactive Policy

It is not the purpose of these policies to prohibit interactive jobs that fall outside of the above guidelines. NERSC will attempt to accommodate requests for interactive resources greater than what is described above. Users who think they have a legitimate need for larger interactive resources should contact NERSC User Services before attempting to run such interactive jobs.

Running Batch Jobs

While the primary role of Davinci is to provide interactive resources as described above, users can also submit batch jobs. The batch software is PBS Pro.

Do not set group write permission for your home directory. If you do this, PBS cannot run your jobs.

Batch scripts

A batch script - a text file with PBS directives and job commands - is required to submit jobs. PBS directive lines, which tell the batch system how to run your job, begin with #PBS. A minimal script to run a shared-memory application that needs to run for 4 hours using 8 CPUs and 48 GB of memory will be very similar to the following example:

#PBS -l mem=48gb,ncpus=8,walltime=04:00:00
#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err
#PBS -A repo
#PBS -q batch
#PBS -V

cd $PBS_O_WORKDIR
./a.out

In the above example, repo is to be replaced by the repository you want to charge the job against.

NOTE: On larger NERSC systems such as Franklin, processors are allocated on a per-node basis; users specify the desired number of nodes in their batch scripts. On Davinci, processors are allocated on a per-CPU basis; batch scripts specify the number of processors desired (ncpus=n). Davinci is not a multi-node system. Please do not specify a number of nodes in Davinci batch scripts.

Jobs that read or write large files should be executed in the $SCRATCH file system. In the sample script above, the line cd $PBS_O_WORKDIR changes the current working directory to the directory from which the script was submitted. The easiest way to run a job using $SCRATCH is to submit the job from a $SCRATCH directory. You may also cd to your $SCRATCH directory in place of cd $PBS_O_WORKDIR.
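
As a sketch of the second approach, the sample script can change to a scratch directory explicitly. The subdirectory name myrun and the file name scratch_job.pbs are placeholders chosen for illustration, and the script assumes that a.out lives in the directory the job was submitted from. The commands below only write the script to a file; it would then be submitted with qsub.

```shell
# Write a batch script that runs in $SCRATCH instead of the submission directory.
# The quoted 'EOF' keeps $SCRATCH unexpanded until the job itself runs.
cat > scratch_job.pbs <<'EOF'
#PBS -l mem=48gb,ncpus=8,walltime=04:00:00
#PBS -N myjob
#PBS -A repo
#PBS -q batch
#PBS -V

# "myrun" is a placeholder subdirectory; stop if it cannot be entered
# rather than running in the wrong place.
mkdir -p "$SCRATCH/myrun"
cd "$SCRATCH/myrun" || exit 1

# Assumption: the executable is in the directory the job was submitted from.
"$PBS_O_WORKDIR/a.out"
EOF
```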

Common PBS Options/Directives
Option         Default                  Description
-A repo        Your default repo        Charge this job to repo.
-e filename    <script_name>.e<job_id>  Write STDERR to filename.
-o filename    <script_name>.o<job_id>  Write STDOUT to filename.
-j [eo|oe]     Do not merge             Merge STDOUT and STDERR. With oe, both are written to
                                        standard output; with eo, both are written to standard error.
-m [a|b|e|n]   a                        E-mail notification options:
                                          a = send mail when job aborted by system
                                          b = send mail when job begins
                                          e = send mail when job ends
                                          n = do not send mail
                                        Options a, b, and e may be combined.
-N job_name    Job script name          Job name: up to 15 printable, non-whitespace characters.
-q queue       batch                    See Batch queues below.
-S shell       Login shell              Specify shell as the scripting language to use.
-V             Do not import            Export the current environment variables into the batch job environment.

All options may be specified either (1) as qsub command-line options or (2) as directives in the batch script, written as #PBS option.
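
To illustrate that the two forms are interchangeable, the sketch below prints the qsub command line that corresponds to the #PBS directives in a script. The helper name pbs_directives_as_qsub and the file name demo_job.pbs are hypothetical; the helper only prints a command and does not submit anything.

```shell
# Hypothetical helper: print the #PBS directives in a script as the
# equivalent qsub command line (it prints the command, it does not run it).
pbs_directives_as_qsub() {
    script=$1
    # Keep only lines that begin with "#PBS", strip that prefix,
    # and join the remaining option text onto one line.
    opts=$(sed -n 's/^#PBS[[:space:]]\{1,\}//p' "$script" | tr '\n' ' ')
    echo "qsub ${opts}${script}"
}

# Small demonstration script (file name chosen for illustration).
cat > demo_job.pbs <<'EOF'
#PBS -N myjob
#PBS -q batch
cd $PBS_O_WORKDIR
./a.out
EOF

pbs_directives_as_qsub demo_job.pbs
# prints: qsub -N myjob -q batch demo_job.pbs
```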

Account (repo) charging

Jobs are charged against your default repository unless otherwise specified. (See Accounts and Charging on Davinci for more information.)

The NIM web interface is used to view and change your default repo.

You can specify the repo to be charged in your PBS script with this directive:

#PBS -A repo_name

or use the -A repo_name option to qsub.

Interactive and debug jobs are charged at the regular priority rate.

Batch queues

There are two submit queues on Davinci. The submit queues will route your job to the correct execution queue based on its requirements.

Submit   Exec     Max     Max    Max         Max    Max Jobs
Queue    Queue    Mem     CPUs   Wallclock   Jobs   per user [1]
debug    debug    6 GB    1      30 mins     4      1
batch    small    24 GB   4      18 hours    3      2
batch    medium   48 GB   8      14 hours    2      2
batch    large    72 GB   12     12 hours    1      1

[1] There is a maximum of 2 running jobs per user over the whole system.
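
The routing of a job submitted to the batch queue can be pictured with the sketch below. It is an assumption for illustration only: the source does not state the actual PBS routing rules, so the hypothetical function route_batch_job simply picks the first execution queue in the table whose memory, CPU, and wallclock limits all cover the request. Wallclock time is given in minutes.

```shell
# Hypothetical sketch of the routing described above: print the first
# execution queue whose limits cover the request, or fail if none does.
#   $1 = memory in GB, $2 = CPUs, $3 = wallclock limit in minutes
route_batch_job() {
    mem_gb=$1
    ncpus=$2
    minutes=$3

    # Limits copied from the queue table: name, max GB, max CPUs, max minutes.
    while read -r queue max_mem max_cpus max_min; do
        if [ "$mem_gb" -le "$max_mem" ] && [ "$ncpus" -le "$max_cpus" ] \
            && [ "$minutes" -le "$max_min" ]; then
            echo "$queue"
            return 0
        fi
    done <<'EOF'
small 24 4 1080
medium 48 8 840
large 72 12 720
EOF
    return 1
}

# The sample script earlier requests 48 GB, 8 CPUs, and 4 hours.
route_batch_job 48 8 240
# prints: medium
```

Note that the larger queues have shorter wallclock limits, so a request for 12 CPUs and more than 12 hours fits no execution queue under this first-fit reading.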

STDOUT, STDERR buffering

PBS stages standard output and standard error to temporary files that are not written into a user's disk space until the job has completed. You can redirect STDOUT and STDERR from the command line into a file that is visible to you during the run, but this scheme may not work in all situations. NERSC is investigating ways to make this redirection more reliable.
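
A minimal sketch of such a redirection follows. The function my_app is a stand-in for the real application (./a.out in the sample script) so that the example runs anywhere, and the log file name myjob.log is a placeholder.

```shell
# Stand-in for the real application (./a.out in the sample script) so that
# this sketch runs anywhere; it writes to both STDOUT and STDERR.
my_app() {
    echo "step 1 done"
    echo "warning: example message" >&2
}

# Log file in the submission directory; falls back to the current
# directory when PBS_O_WORKDIR is not set (that is, outside a batch job).
logfile="${PBS_O_WORKDIR:-.}/myjob.log"

# Send STDOUT to the log file and STDERR to the same place.
my_app > "$logfile" 2>&1
```

In a real batch script the last line would read ./a.out > "$PBS_O_WORKDIR/myjob.log" 2>&1, and running tail -f myjob.log from a login session would show the output as it is written, subject to the caveat above that this may not work in all situations.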

Submitting a job

To submit a job for execution, type

davinci% qsub batchscript

where batchscript is the name of the batch script. The output of the qsub command will include the jobid. Users should record this information, as it is very useful in debugging job failures.
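
One way to keep that record automatically is sketched below. The helper name submit_and_record and the log file name submitted_jobs.log are hypothetical, and the sketch assumes that qsub prints only the job identifier on standard output.

```shell
# Hypothetical helper: submit a batch script and append the job ID to a log.
# Assumes qsub prints only the job identifier on standard output.
submit_and_record() {
    script=$1
    jobid=$(qsub "$script") || return 1
    # Record the submission time, job ID, and script name for later debugging.
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$jobid" "$script" \
        >> submitted_jobs.log
    echo "$jobid"
}

# On Davinci:  submit_and_record batchscript
```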

Deleting a job

To delete a previously submitted job, type

davinci% qdel jobid

where jobid is the job identifier reported by the qsub command.

Job monitoring

Job progress can be monitored with the PBS command qstat:

davinci% qstat
Job id           Name             User              Time Use S Queue
---------------- ---------------- ----------------  -------- - -----
6070.jdvin01     fspack           user1             01:12:45 R small
6075.jdvin01     proto            user2             01:02:40 R large
6084.jdvin01     testzz           user3                    0 Q small
6089.jdvin01     ckk.011          user2                    0 Q medium
6093.jdvin01     rv991            user4                    0 Q large
6102.jdvin01     rv992            user4                    0 Q large
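
Because running jobs are limited per user, a quick count of running and queued jobs can be handy. The sketch below summarizes a listing in the format shown above; the helper name count_job_states is hypothetical, and it assumes the state letter is the fifth whitespace-separated field, as in this default listing.

```shell
# Hypothetical helper: summarize qstat output read from standard input.
# Skips the two header lines, then counts jobs by the state column
# (field 5 in the listing above: R = running, Q = queued).
count_job_states() {
    awk 'NR > 2 && NF >= 6 { n[$5]++ }
         END { printf "running=%d queued=%d\n", n["R"], n["Q"] }'
}

# On Davinci:  qstat | count_job_states
# Here part of the sample listing is fed in directly so the sketch runs anywhere.
count_job_states <<'EOF'
Job id           Name             User              Time Use S Queue
---------------- ---------------- ----------------  -------- - -----
6070.jdvin01     fspack           user1             01:12:45 R small
6084.jdvin01     testzz           user3                    0 Q small
6093.jdvin01     rv991            user4                    0 Q large
EOF
# prints: running=1 queued=2
```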

Page last modified: Fri, 05 Oct 2007 18:02:18 GMT
Page URL: http://www.nersc.gov/nusers/systems/davinci/running_jobs.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
