Writing Batch Scripts for Commodity Clusters

See this article in context within the following user guides: Lens

Batch scripts are used to run a set of commands on a cluster’s compute partition. A batch script is simply a shell script containing options to the batch scheduler software (e.g., PBS) followed by commands to be interpreted by a shell. The script is submitted to the batch scheduler, where it is parsed. Based on the parsed data, PBS places the script in the queue as a batch job. Once the batch job makes its way through the queue, the script will be executed on the primary compute node of the allocated resources.

Components of a Batch Script

Batch scripts are parsed into the following (3) sections:

Interpreter Line

The first line of a script can be used to specify the script’s interpreter; this line is optional. If not used, the submitter’s default shell will be used. The line uses the hash-bang syntax, i.e., #!/path/to/shell.

PBS Submission Options

The PBS submission options are preceded by the string #PBS, making them appear as comments to a shell. PBS will look for #PBS options in a batch script from the script’s first line through the first non-comment line. A comment line begins with #. #PBS options entered after the first non-comment line will not be read by PBS.
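
The scanning rule described above can be illustrated with a small shell sketch. This is not how PBS itself is implemented; the function name list_pbs_options is hypothetical, and skipping empty lines instead of treating them as the end of the option block is an assumption made for the illustration.

```shell
#!/bin/bash
# Illustrative only: mimic the rule that #PBS options are read from the
# top of a script until the first non-comment line.
# list_pbs_options is a hypothetical helper, not part of PBS.
list_pbs_options() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            '#PBS '*) printf '%s\n' "$line" ;;  # option line: report it
            '#'*)     ;;                         # other comment (or #!): keep scanning
            '')       ;;                         # assumption: empty lines are skipped
            *)        break ;;                   # first command: stop scanning
        esac
    done < "$1"
}
```

Running list_pbs_options test.pbs would then print only the #PBS lines that appear before the first command in test.pbs; any #PBS line placed after a command is never reached.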

Shell Commands

The shell commands follow the last #PBS option and represent the executable content of the batch job. If any #PBS lines follow executable statements, they will be treated as comments only. The one exception to this ordering is the shell specification on the first line of the script, which precedes the #PBS options.

The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments. Commands within this section will be executed on the batch job’s primary compute node after the job has been allocated. During normal execution, the batch script will end and exit the queue after the last line of the script.

Example Batch Script
  1: #!/bin/bash
  2: #PBS -A XXXYYY
  3: #PBS -N test
  4: #PBS -j oe
  5: #PBS -l walltime=1:00:00,nodes=2:ppn=4
  6:
  7: cd $PBS_O_WORKDIR
  8: date
  9: mpirun -n 8 ./a.out

This batch script can be broken down into the following sections:

Interpreter Line

1: This line is optional and can be used to specify a shell to interpret the script.

PBS Options

2: The job will be charged to the “XXXYYY” project.
3: The job will be named test.
4: The job’s standard output and error will be combined into one file.
5: The job will request (8) total compute cores from (2) unique physical nodes for (1) hour.

Shell Commands

6: This line is left blank, so it will be ignored.
7: This command will change the current directory to the directory from which the script was submitted.
8: This command will run the date command.
9: This command will run the executable a.out on (8) cores via MPI.
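
The link between the resource request on line 5 and the mpirun call on line 9 is simple arithmetic: the number of nodes multiplied by ppn gives the total core count. A minimal sketch, using illustrative variable names that PBS does not set:

```shell
#!/bin/bash
# Illustrative variables mirroring "#PBS -l nodes=2:ppn=4"; PBS does not set these.
nodes=2
ppn=4

# Total cores requested = nodes x processes per node.
total_cores=$(( nodes * ppn ))

# Show the mpirun line that matches the request.
echo "mpirun -n ${total_cores} ./a.out"
```

If the nodes or ppn values in the #PBS -l line change, the count passed to mpirun -n should change with them so that the job uses exactly the cores it requested.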

Batch scripts can be submitted for execution using the qsub command. For example, the following will submit the batch script named test.pbs:

  qsub test.pbs

If the submission succeeds, a PBS job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job; make a note of the job ID for each of your jobs in case you must contact the OLCF User Assistance Center for support.
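
One way to keep a note of each job ID is to capture the output of qsub at submission time. The following is a sketch only: submit_job is a hypothetical wrapper function and jobids.log is an arbitrary file name, neither of which is provided by PBS or the OLCF.

```shell
#!/bin/bash
# submit_job is a hypothetical wrapper; jobids.log is an arbitrary file name.
submit_job() {
    local script=$1 jobid
    # qsub prints the job ID on standard output when submission succeeds.
    jobid=$(qsub "$script") || return 1
    # Keep a record so the ID is at hand if support is needed later.
    printf '%s %s\n' "$jobid" "$script" >> jobids.log
    # Also print the ID so the caller can capture it.
    printf '%s\n' "$jobid"
}
```

For example, jobid=$(submit_job test.pbs) would store the ID in a shell variable, after which qstat "$jobid" could be used to check on the job.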

Note: For more batch script examples, please see the Batch Script Examples page.