Biowulf at the NIH
SAS on Biowulf

Base SAS provides a scalable, integrated software environment specially designed for data access, transformation and reporting. It includes a fourth-generation programming language; ready-to-use programs for data manipulation, information storage and retrieval, descriptive statistics and report writing; and a powerful macro facility that reduces programming time and maintenance issues. The Base SAS windowing environment provides a full-screen facility for interacting with all parts of a SAS program. On-line help is also available.

Base SAS is multithreaded and can use parallel processing to make fuller use of computing resources. However, the main advantage of using SAS on Biowulf is the ability to run many SAS jobs simultaneously in batch mode as a 'swarm' of single-threaded jobs.

SAS on Biowulf is a limited resource, and the PBS batch system keeps track of it. A job requiring SAS starts only when enough SAS resources are available; otherwise it remains in the queue. All SAS jobs must therefore specify the resource (sas=N) when the job is submitted, as in the examples below. At present there is a limit of 48 simultaneous SAS processes, but this number is subject to change.

Running a swarm of SAS jobs

The swarm program is designed to submit a group of commands to the Biowulf cluster. Each command is represented by a single line in the swarm command file that you create, and runs as a separate batch job. See the swarm page for more information.

Create a swarm command file, named sasjobs for instance, in which each line contains a single SAS run. Example:

sas sasjob1.sas
sas sasjob2.sas
sas sasjob3.sas
sas sasjob4.sas
sas sasjob5.sas
[...]
sas sasjob100.sas
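
For a long list, the command file does not have to be typed by hand. The following is a minimal sketch that generates it with a shell loop, assuming the SAS programs really are named sasjob1.sas through sasjob100.sas as in the listing above; adjust the range and the file names to match your own programs.

```shell
# Build a swarm command file named "sasjobs" with one SAS run per line.
# Assumption: the programs are named sasjob1.sas ... sasjob100.sas.
for i in $(seq 1 100); do
    echo "sas sasjob${i}.sas"
done > sasjobs          # the whole loop's output goes into the file

# Quick check: count the lines and look at the first few
wc -l < sasjobs
head -n 3 sasjobs
```
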
Submit this to the batch system with the command:
swarm -f sasjobs -l nodes=1,sas=4
This command submits a swarm of SAS processes (100 SAS commands in the example swarm command file) to the batch system. swarm sends 4 SAS processes to a single node, so 4 SAS resources are requested for each swarm job. Depending on the availability of SAS resources on Biowulf, some jobs may start immediately and others may remain in the queue. For a job that remains queued, the command 'qstat -f jobnumber' reports in its comment field that the job is waiting for the SAS resource. For example:
<biowulf> qstat -f 1389167
Job Id: 1389167.biobos
Job_Name = swarm13n959
[...]
comment = Insufficient amount of resource sas
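
To get a feel for how many jobs are likely to wait, the numbers from this example can be worked through with shell arithmetic. This is only a sketch: it assumes swarm packs exactly 4 commands into each job, that the 48-process limit quoted above is still current, and that the limit is shared by all users.

```shell
# Worked arithmetic for the example swarm (values taken from the text above;
# the 48-process limit is subject to change).
commands=100      # lines in the swarm command file
per_job=4         # SAS processes per swarm job (sas=4)
limit=48          # current limit on simultaneous SAS processes

# Number of swarm jobs created, rounding up for a partial last job
swarm_jobs=$(( (commands + per_job - 1) / per_job ))

# Number of swarm jobs that can hold the SAS resource at the same time
max_running=$(( limit / per_job ))

echo "swarm jobs: $swarm_jobs, can run at once: $max_running"
```

Under these assumptions the example produces 25 swarm jobs, of which at most 12 can run at once, and fewer if other users are running SAS. To show only the reason a particular job is waiting, the qstat output can be filtered, e.g. 'qstat -f 1389167 | grep comment'.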

Running SAS interactively

If you simply log in to Biowulf and run the sas command, the SAS Workspace environment will run on the main Biowulf login node. This is discouraged because the login node is not intended for running applications. To run SAS interactively, allocate a node for interactive use. Once the node is allocated, you can type commands directly on its command line. Interactive jobs are appropriate during testing, but long jobs or large numbers of jobs should be submitted as swarms or as regular batch jobs. Example:

biowulf% qsub -I -l nodes=1,sas=1
qsub: waiting for job 2011.biobos to start
qsub: job 2011.biobos ready

p139$ sas foobar.sas
p139$ exit
logout

qsub: job 2011.biobos completed
biowulf% 
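
For a single long run, the same resource request can go into a regular batch script instead of an interactive session. The following is a minimal sketch, not a tested Biowulf recipe: the script name sasjob.sh and the job name sasjob are hypothetical, and it assumes the standard PBS conventions of '#PBS' directive lines and the PBS_O_WORKDIR variable apply on Biowulf.

```shell
# Write a minimal PBS batch script for one SAS run.
# Hypothetical names: sasjob.sh (script file) and sasjob (job name).
cat > sasjob.sh <<'EOF'
#!/bin/bash
#PBS -N sasjob
#PBS -l nodes=1,sas=1
# Start in the directory the job was submitted from
cd "$PBS_O_WORKDIR"
sas foobar.sas
EOF

# Submit from the Biowulf login node with:
#   qsub sasjob.sh
```

The '#PBS -l nodes=1,sas=1' line requests the same resources as the '-l' option in the interactive example, so the job waits in the queue until a SAS resource is free.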

With an X Windows-capable connection, typing 'sas' at the Biowulf prompt brings up the SAS Workspace X Windows interface, where you can type any SAS command into the window. Again, running SAS on the login node should be limited to testing or small development tasks. All other SAS jobs should be run either in batch or on an interactive node, as below. The -V option exports your current environment variables, including DISPLAY, to the job.

biowulf% qsub -I -l nodes=1,sas=1 -V
qsub: waiting for job 423061.biobos to start
qsub: job 423061.biobos ready

[joe@p955 ~]$ sas

You should see the SAS logo pop up briefly, and then the menus will appear.


Documentation