PBS on Phoenix
The Portable Batch System (PBS) is the batch-job scheduler for Phoenix.
It also allocates CPUs for interactive parallel jobs. This document
explains how to get started with the batch facilities of
PBS.
Different users may have access to different queues, and different queues may
have different job limits or may target different nodes.
Use the "qstat -q" command to see the current list of queues.
$ qstat -q
server: phoenix1f1.ccs.ornl.gov
Queue            Memory CPU Time Walltime Node   Run   Que   Lm State
---------------- ------ -------- -------- ---- ----- ----- ---- -----
x_interactive        --       --       --   --     0     0   --   E R
batch                --       -- 12:00:00   --     0     0   --   E R
sys                  --       --       --   --     0     0   --   E R
special              --       -- 24:00:00   --     0     0   --   E R
interactive          --       -- 02:00:00   --     0     0   --   E R
long                 --       -- 12:00:00   --     2     1   --   E R
short                --       -- 06:00:00   --     1     1   --   E R
standby              --       --       --   --     0     0   --   E R
immediate            --       -- 24:00:00   --     0     0   --   E R
                                               ----- -----
                                                   3     2
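When you only need the walltime limit of each queue, the listing above can be reduced with a small filter. The following is a minimal sketch, assuming the column layout shown in the sample output; "queue_walltimes" is a hypothetical helper name, not a PBS command.

```shell
#!/bin/sh
# queue_walltimes: hypothetical helper (not a PBS command).
# Reads "qstat -q" output on stdin and prints "<queue> <walltime limit>".
queue_walltimes() {
    awk '
        /^ *-+/ { rules++; next }               # dashed rule lines open and close the queue table
        rules == 1 && NF >= 4 { print $1, $4 }  # field 4 is the Walltime column
    '
}

# Typical use on the system:
#   qstat -q | queue_walltimes
```

A queue with no walltime limit is printed with "--", exactly as it appears in the listing.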
The "batch" queue is the default queue: a job submitted as a PBS
script without a queue specification goes to batch.
Jobs can be submitted only to the routing queues, which
are as follows:
- batch - default
- interactive - for interactive batch work
- special - for use in special situations only; requires authorization
- sys - for use by system personnel
The "interactive" queue is used only when the
"-I" option is added to the submission command.
The other queues (execution queues) are for controlling job throughput.
To run an interactive job, use "qsub -I". For
example, to run an interactive job with 1 process, you might use
qsub -I -lwalltime=1:00:00,mppe=1,mem=4Gb
cd $SYSTEM_USERDIR
aprun -n 1 test_example
The first line starts a new shell in your home directory. You then
"cd" to wherever you want to run and use aprun to run
your code, just as you would in a
batch script.
aprun is the Cray method of starting an application on application
nodes. aprun has a few options that the user must know about:
Frequently Used aprun Parameters
Parameter Format | Definition |
-n | Specifies the number of MSPs or SSPs. |
-c core=unlimited | Changes the default core size limit to unlimited. Default is 0. Only do this if running in $SYSTEM_USERDIR. |
-d | Specifies the number of threads per process. |
To run a batch job under PBS, you first need to write a job command
file. PBS command files have two components: PBS keyword statements and
shell commands. The PBS keyword statements are preceded by
"#PBS", making them appear as comments to a shell. The shell
commands follow the last "#PBS" keyword statement and
represent the executable content of the batch job. If any
"#PBS" lines follow executable statements, they will be treated
as comments only.
Note that a user's job may not run if the user's start-up files (e.g., .cshrc,
.login, or .profile) contain commands that attempt to set terminal
characteristics. Any such command sequence within these files should be
skipped by testing for the environment variable PBS_ENVIRONMENT.
Commands in your start-up files also should not generate output when
run under PBS, so any command that writes to stdout should be skipped
in the same way. The following sample .login guards both kinds of
command:
...
setenv MANPATH /usr/man:/usr/local/man:$MANPATH
if ( ! $?PBS_ENVIRONMENT ) then
    # do terminal settings here
    # run commands with output here
endif
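The sample .login above uses csh syntax. For users whose login shell is sh or ksh, the same guard could be written in .profile as sketched below. This is only an illustration: the function name "login_only_setup" and the echo line are placeholders, and the only PBS-specific assumption is that PBS_ENVIRONMENT is unset in an ordinary login.

```shell
# Hypothetical ~/.profile fragment for sh or ksh users.
# PBS sets PBS_ENVIRONMENT inside a job; it is unset in an ordinary login.
login_only_setup() {
    if [ -z "${PBS_ENVIRONMENT:-}" ]; then
        # Terminal settings (for example, stty commands) would go here.
        # Commands that write to stdout would go here, for example:
        echo "login shell initialized"
    fi
}
login_only_setup
```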
If the user's login shell is csh the following message
may appear in the standard output of a job:
Warning: no access to tty, thus no job control in this shell
This message is produced by many csh versions when the shell
determines that its input is not a terminal. Short of modifying
csh, there is no way to eliminate the message. Fortunately,
it is just an informative message and has no effect on the job.
Below you will find an example of a command file, specifying some
typical PBS keywords.
#PBS -N test
#PBS -j oe
#PBS -q batch
#PBS -l walltime=1:00:00,mem=4Gb,mppe=1
cd $SYSTEM_USERDIR
aprun -n 1 ./test
Line 1 shows how to name the job. Line 2 shows how to join stdout
and stderr into a file named "<batch_script_name>.o$PBS_JOBID".
Note that in the example above standard error is added to standard
output. If the order is changed (eo), then standard output is added to
standard error. Line 3 specifies the queue the job will be submitted
to; the default is batch. Line 4 specifies important resource limits;
these include walltime, memory, number of CPUs (MSPs), and
number of SSPs. See the MPI jobs
section below for more information.
Frequently Used QSUB Parameters
Parameter Format | Definition |
#PBS -A acct | Causes the job time to be charged to "acct". |
#PBS -a date_time | Declares the time after which the job is eligible for execution. |
#PBS -q batch | The '-q' parameter directs the job to a specified queue, in this case, the 'batch' or default queue. |
#PBS -j {eo,oe} | Causes the standard error and standard output to be combined in one file: |
- eo - standard output is added to standard error
- oe - standard error is added to standard output
#PBS -l <resource> | Resources: |
- mem - memory; default is 4 GB
- mppe - specifies the maximum number of MSPs used in a job. This can be used to reserve MSPs for MSP and SSP codes. This is independent of mppssp. Default is 0.
- mppssp - specifies the maximum number of SSPs used in a job. This can only be used for SSP codes. This is independent of mppe. Default is 0. (It is NOT threads per process.)
- walltime - wall clock time
#PBS -m {a,b,e} | Causes mail to be sent to the user when: |
- a - the job aborts
- b - the job begins running
- e - the job ends running
#PBS -N name | Sets the job name to "name" instead of the name of the script file. |
#PBS -o name | Sets the standard output file to "name" instead of script_file_name.o$PBS_JOBID. $PBS_JOBID is an environment variable created by PBS that contains the PBS job identifier. |
#PBS -e name | Sets the standard error file to "name" instead of script_file_name.e$PBS_JOBID. |
#PBS -S <shell> | Sets the shell to use. Make sure the full path to the shell is correct. |
#PBS -V | Declares that all environment variables are to be exported to the batch job. |
#PBS -W | Used to set job dependencies between two or more jobs. |
A useful environment variable is PBS_O_WORKDIR. When your batch job
starts, PBS sets it to the directory from which the job
was submitted. By default, a PBS batch job starts in your home
directory.
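Because the job starts in your home directory, a command file that should run where it was submitted has to change directory itself. The sketch below shows one way to do that with PBS_O_WORKDIR. The fallback to the current directory is an assumption added so the script can also be tried outside PBS, and the aprun line is left as a comment.

```shell
#PBS -N test
#PBS -j oe
#PBS -q batch
#PBS -l walltime=1:00:00,mem=4Gb,mppe=1

# PBS_O_WORKDIR is set by PBS; fall back to the current directory so the
# script also runs outside a batch job (an assumption for local testing).
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1
echo "Job running in: $(pwd)"

# The application launch would follow, for example:
#   aprun -n 1 ./test
```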
Here is an example command file for a parallel MPI job.
#PBS -N test
#PBS -j oe
#PBS -q batch
#PBS -l walltime=1:00:00,mem=16Gb,mppe=4
cd $SYSTEM_USERDIR
aprun -n 4 ./test
This job requires up to an hour of runtime, 16 GB of memory, and 4 MSPs.
This batch script would be used to run an executable with 4 MPI
processes, with the assumption that the executable multistreams
and thus uses the 4 SSPs available per MSP.
If the executable "test" were an SSP code, then this batch
job would use only 4 SSPs of the 16 that were reserved (as 4 MSPs).
This is perfectly acceptable, but note that you will be charged for
use of 4 MSPs or, equivalently, 16 SSPs.
Note that if your executable is an SSP code, you could use
a command file that looks like the following:
#PBS -N test
#PBS -j oe
#PBS -q batch
#PBS -l walltime=1:00:00,mem=4Gb,mppssp=4
cd $SYSTEM_USERDIR
aprun -n 4 ./test
This gets 4 SSPs (likely from one MSP but not guaranteed) and since your
code is an SSP executable, the "aprun -n 4" part knows to use
4 SSPs rather than 4 MSPs.
Important:
Most users should specify either mppe or mppssp, not both.
Specifying both mppe and mppssp indicates that you want
mppe+mppssp resources. The mppe resource can
be used to run MSP or SSP codes, so it will probably be used most.
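The reservation arithmetic can be checked with a few lines of shell. This is a sketch under two assumptions taken from the text: each MSP provides 4 SSPs, and a request with both resources reserves mppe MSPs plus mppssp SSPs. The helper name "ssp_equivalent" is hypothetical.

```shell
# ssp_equivalent MPPE MPPSSP
# Hypothetical helper: prints the total number of SSPs a request reserves,
# assuming 4 SSPs per MSP and that mppe and mppssp add together.
ssp_equivalent() {
    mppe=${1:-0}
    mppssp=${2:-0}
    echo $(( $mppe * 4 + $mppssp ))
}

ssp_equivalent 4 0   # mppe=4 (the MPI example): prints 16
ssp_equivalent 0 4   # mppssp=4 (the SSP example): prints 4
ssp_equivalent 4 4   # both resources requested: prints 20
```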
All PBS-provided environment variable names start with the characters
PBS_ . Some are then followed by a capital O (PBS_O_ ), indicating
that the variable is from the job's originating environment (i.e., the
user's). The following short example lists some of the more useful
variables and typical values.
- PBS_O_HOME=/spin/home/<username>
- PBS_O_LOGNAME=<username>
- PBS_O_SHELL=/bin/ksh
- PBS_O_HOST=phoenix1f1.ccs.ornl.gov
- PBS_O_WORKDIR=<directory from where you submitted the job>
- PBS_O_QUEUE=batch
- PBS_O_TZ=EST5EDT
- PBS_JOBNAME=INTERACTIVE
- PBS_JOBID=149.phoenix1f1.ccs.ornl.gov
- PBS_QUEUE=batch
- PBS_ENVIRONMENT=PBS_INTERACTIVE
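A script can use these variables to report how it is being run. The sketch below is illustrative only: "pbs_context" is a hypothetical function name, and the value PBS_BATCH for non-interactive jobs is an assumption, since only PBS_INTERACTIVE appears in the list above.

```shell
# pbs_context: hypothetical helper that reports how the current shell was started.
# PBS_BATCH is assumed to be the value PBS uses for non-interactive jobs.
pbs_context() {
    case "${PBS_ENVIRONMENT:-}" in
        PBS_INTERACTIVE) echo "interactive PBS job ${PBS_JOBID:-unknown}" ;;
        PBS_BATCH)       echo "batch PBS job ${PBS_JOBID:-unknown}" ;;
        "")              echo "not running under PBS" ;;
        *)               echo "PBS environment: $PBS_ENVIRONMENT" ;;
    esac
}

# To list every PBS-provided variable in the current shell:
#   env | grep '^PBS_'
```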
Use "qsub" to submit a job command file for batch execution.
The job shell will NOT inherit the working directory
from which you submitted the job, so you might want to use
PBS_O_WORKDIR to reference the directory from which the job was
submitted.
Also, unless you use full path names, the standard output and
standard error files will be saved in the directory from which the job was submitted.
If you forget to supply a "walltime" limit, your job will
get the default limit, regardless of queue.
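A quick check before submission can catch a command file that lacks a wall clock request (the "-l walltime=..." resource used in the command files above). The following is a minimal sketch: "has_walltime" is a hypothetical helper and "job.pbs" a placeholder file name. It only looks for the text of the request and does not verify that the "#PBS" line precedes the first executable statement.

```shell
# has_walltime FILE: hypothetical pre-submission check (not part of PBS).
# Succeeds if some "#PBS" line in FILE carries a "-l ... walltime=" request.
has_walltime() {
    grep -q '^#PBS.*-l.*walltime=' "$1"
}

# Typical use; "job.pbs" is a placeholder file name:
#   has_walltime job.pbs && qsub job.pbs
```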
Use "qstat -a" to check the status of submitted jobs.
phoenix1f1.ccs.ornl.gov: ORNL/CCS
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
149.phoenix1f1. user1    short    STDIN         813  --   1    1gb 02:00 R 01:10
150.phoenix1f1. user2    batch    job1          860  --   4    4gb 12:00 R 00:54
151.phoenix1f1. user3    batch    test          898  --  16    4gb 12:00 R 00:32
152.phoenix1f1. user4    batch    runit         914  --   4    1gb 08:00 R 00:01
153.phoenix1f1. user1    sys      job            --  --   4    4gb 12:00 Q    --
154.phoenix1f1. user5    batch    n16            --  --  16   64gb 24:00 W    --
The first column is the id of each job (which has been truncated)
and the second column is the owner.
The "S" column gives the status of
each job. Here are some common status values.
E | Job is exiting after having run |
H | Held |
Q | Queued, eligible to run |
R | Running |
S | Job is suspended |
T | Job is being moved to new location |
W | Waiting for its execution time |
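The status letters can be expanded automatically when scanning a long listing. The sketch below assumes the "qstat -a" column layout shown earlier, where the status is the tenth whitespace-separated field, and job names without spaces. Both function names are hypothetical.

```shell
# job_state_name LETTER: hypothetical helper that expands a qstat status letter.
job_state_name() {
    case "$1" in
        E) echo "exiting after having run" ;;
        H) echo "held" ;;
        Q) echo "queued, eligible to run" ;;
        R) echo "running" ;;
        S) echo "suspended" ;;
        T) echo "being moved to new location" ;;
        W) echo "waiting for its execution time" ;;
        *) echo "unknown status: $1" ;;
    esac
}

# list_job_states: reads "qstat -a" output on stdin and prints
# "<job id> <status description>" for each job row.
list_job_states() {
    awk '$1 ~ /^[0-9]+\./ { print $1, $10 }' | while read -r id state; do
        echo "$id $(job_state_name "$state")"
    done
}

# Typical use on the system:
#   qstat -a | list_job_states
```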
You can use "qdel" with a job id to cancel that
job. The command removes waiting jobs and aborts running jobs.
$ qdel 12816
You can also keep a job from running without removing it from
PBS using "qhold <jobid>" with a list of job
ids. You can then use "qrls <jobid>" to release held
jobs and allow them to run.
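The job id commands fit together as sketched below. This is illustrative only: "job_number" is a hypothetical helper that reduces a full job id to its numeric part (the form used in the qdel example above), and "job.pbs" is a placeholder file name. The PBS commands themselves are shown as comments because they need a running PBS server.

```shell
# job_number JOBID: hypothetical helper that strips the server suffix,
# e.g. 149.phoenix1f1.ccs.ornl.gov becomes 149.
job_number() {
    echo "${1%%.*}"
}

# Sketch of a hold/release/cancel cycle; "job.pbs" is a placeholder file name:
#   jobid=$(qsub job.pbs)            # qsub prints the new job identifier
#   qhold "$jobid"                   # keep the job from running
#   qrls "$jobid"                    # release it again
#   qdel "$(job_number "$jobid")"    # cancel it
```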
One can also change the order in which two of the user's jobs
are processed using the "qorder" command.
Phoenix has "man" pages for each of the PBS commands as well
as a PBS man page.