Batch Scripts

Batch scripts can be used to run a set of commands on a system’s compute partition. The batch script is a shell script containing PBS flags and commands to be interpreted by a shell. The batch script is submitted to the batch manager, PBS, where it is parsed. Based on the parsed data, PBS places the script in the queue as a job. Once the job makes its way through the queue, the script will be executed on the head node of the allocated resources.

Batch scripts are parsed into the following three sections:

  1. Interpreter line
    • The first line of a script can be used to specify the script’s interpreter.
    • This line is optional.
    • If not used, the submitter’s default shell will be used.
    • The line uses the syntax #!/path/to/shell, where the path to the shell may be
      • /usr/bin/csh
      • /usr/bin/ksh
      • /usr/bin/sh
    • If the batch script uses module commands, omit the interpreter line. The script should then be able to use module commands through the submitting user’s login environment, which is passed to the job.
  2. PBS submission options
    • The PBS submission options are preceded by #PBS, making them appear as comments to a shell.
    • PBS will look for #PBS options in a batch script from the script’s first line through the first noncomment line. A comment line begins with #.
    • #PBS options entered after the first noncomment line will not be read by PBS.
  3. Shell commands
    • The shell commands follow the last #PBS option and represent the executable content of the batch job.
    • Any #PBS lines that follow an executable statement are treated as ordinary comments. The one line beginning with # that is not treated as a comment is the shell specification on the first line of the script.
    • The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments.
    • During normal execution, the batch script will end and exit the queue after the last line of the script.
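
The scanning rule described under item 2 can be sketched as a small shell function. This is an illustration, not PBS's actual parser: the function name pbs_directives is hypothetical, and skipping blank lines instead of ending the scan at them is an assumption.

```shell
# pbs_directives FILE: print the #PBS lines that the scanning rule would read.
# Assumption: blank lines are skipped and do not end the scan.
pbs_directives() {
    awk '
        /^[ \t]*$/ { next }          # blank line: skip (assumed behavior)
        /^#PBS/    { print; next }   # PBS option line: report it
        /^#/       { next }          # interpreter line or ordinary comment
                   { exit }          # first noncomment line ends the scan
    ' "$1"
}
```

Run against a script, the function lists only the #PBS lines that appear before the first shell command; any #PBS line placed after a command is not printed, which mirrors how such lines are treated as comments.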

Example Batch Script

1:
2: #PBS -A XXXYYY
3: #PBS -N test
4: #PBS -j oe
5: #PBS -l walltime=1:00:00,size=16384
6:
7: cd $PBS_O_WORKDIR
8: date
9: aprun -n 16384 ./a.out

This batch script can be broken down into the following sections:

  • Shell interpreter
    • Line 1
      • This line can be used to specify a shell to interpret the script.
  • PBS commands
    • The PBS options will be read and used by PBS upon submission.
    • Lines 2–5
      • 2: The job will be charged to the “XXXYYY” project.
      • 3: The job will be named “test.”
      • 4: The job’s standard output and error will be combined.
      • 5: The job will request 16,384 compute cores (4,096 sockets) for 1 hour.
      • NOTE: Because users cannot share sockets and each socket has four cores, size requests must be a multiple of 4.

  • Shell commands
    • Once the requested resources have been allocated, the shell commands will be executed on the allocated nodes’ head node.
    • Lines 6–9
      • 6: This line is left blank, so it will be ignored.
      • 7: This command will change the directory to the script’s submission directory.
      • 8: This command will run the date command.
      • 9: This command will use aprun to run the executable a.out on 16,384 cores (all four cores of each compute node).
  • Notice:
    Compute nodes can see only the Lustre work space.

    The NFS-mounted home, project, and software directories are not accessible to the compute nodes.

    • Executables must be executed from within the Lustre work space.
    • Batch jobs can be submitted from the home or work space. If submitted from a user’s home area, the user should cd into the Lustre work space directory prior to running the executable through aprun. An error similar to the following may be returned if this is not done:
              aprun: [NID 94]Exec /tmp/work/userid/a.out failed: chdir /autofs/na1_home/userid
              No such file or directory
    • Input must reside in the Lustre work space.
    • Output must also be sent to the Lustre file system.
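
Two of the constraints noted above (size requests in multiples of 4, and running from within the Lustre work space) can be checked with a few lines of shell. This is a minimal sketch: the function names and the WORK_PREFIX variable are hypothetical, and the /tmp/work default is an assumption taken from the path in the sample error message, so substitute the work space path for your system.

```shell
# size_is_valid SIZE: succeed if SIZE is a positive multiple of 4.
size_is_valid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -gt 0 ] && [ $(( $1 % 4 )) -eq 0 ]
}

# in_work_space DIR: succeed if DIR lies under the assumed Lustre prefix.
# WORK_PREFIX is a hypothetical override; /tmp/work is an assumed default.
in_work_space() {
    case "$1" in
        "${WORK_PREFIX:-/tmp/work}"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A batch script could call in_work_space "$PWD" just before the aprun line and stop with a message if the check fails, instead of running into the chdir error shown above.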

Submitting Batch Jobs

Batch scripts can be submitted for execution using the qsub command. For example, the following will submit the batch script named test.pbs:

    qsub test.pbs

If successfully submitted, a PBS job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job, so it is a good idea to make a note of the job ID for each of your jobs.
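
Because the job ID is useful for tracking and troubleshooting, it can help to record it at submission time. The following is a minimal sketch, not part of PBS: submit_and_log, the SUBMIT_CMD override, and the log file name submitted_jobs.log are all hypothetical; only qsub itself comes from the text above.

```shell
# submit_and_log SCRIPT [LOGFILE]: submit SCRIPT and record the returned job ID.
# SUBMIT_CMD is a hypothetical override for testing; it defaults to qsub.
submit_and_log() {
    script=$1
    log=${2:-submitted_jobs.log}   # hypothetical default log file name

    # qsub prints the job ID on success; stop if the submission fails.
    jobid=$("${SUBMIT_CMD:-qsub}" "$script") || return 1

    # Append a timestamped record, then echo the ID for the caller.
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$jobid" "$script" >> "$log"
    printf '%s\n' "$jobid"
}
```

Calling submit_and_log test.pbs prints the job ID and appends one line to the log, so the ID is still available later if the job fails and needs troubleshooting.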