Structual Biology on the Biowulf Cluster

AMBER

CHARMM

GROMACS

NAMD

APBS

GAMESS

GAUSSIAN03

Q-chem

PROSPECT

HADDOCK

CNS, XPLOR-NIH

AMoRe

PovRay

VMD

RasMol

Rosetta

ZDOCK

nest

Autodock

PROCHECK

DSSP

Benchmarks for MD (AMBER, CHARMM, and NAMD --http://brooks.scripps.edu/charmm_docs/Benchmarks/chm_amb_namd.html)

All applications run on Biowulf must submitted through qsub. For short tests, an interactive session can be started with the -I flag, but all long runs (greater than 30 minutes) should be submitted to the regular batch queue.

Quick qsub tutorial

A script containing commands is created:

myjob.sh:

#!/bin/bash
myprog < /data/me/mydata

This is submitted with qsub:

qsub -l nodes=1 myjob.sh

Minimally, the number of nodes must be supplied with the -l nodes=1 options. More precise properties required can be added:

qsub -l nodes=1:o2200:myr2k:m2048 myjob.sh

Node properties:

faste
fast ethernet (100 Mb/s) interconnect

gige
gigabit ethernet (1 Gb/s) interconnect

myr2k
Myrinet (2 Gb/s) interconnect

ib
Infiniband (10 Gb/s) interconnect

m1024
1 GB memory

m2048

2 GB memory

m4096
4 GB memory

p2800
2.8 GHz Intel Xeon

o2000
2.0 GHz AMD Opteron 246

o2200
2.2 GHz AMD Opteron 248

o2600
2.6 GHz AMD Opteron 285, dual-core (4 CPU)

o2800
2.8 GHz AMD Opteron 254

altix
SGI Altix 350 (see Firebolt page for more information)

x86-64

o2000 + o2200 nodes + o2800 nodes

dc

dual-core (o2600) nodes

centos

2.8 GHz dual-core (o2800) nodes running CentOS 4.2

Other options:

-N name Declare a name for the job

-m mail_options Send mail to user upon 'a' (abort), 'b' (begin), 'e' (end)

-k keep e = standard error, o = standard output

-s path_list Declares the shell that interprets the job

-v variable_list Export named environment variables to hosts running job

-V Export all environment variables

All options (except for -l nodes=... option) can be placed within the qsub script:

myjob.sh:

#!/bin/bash
#PBS -N MyJob
#PBS -m be
#PBS -k oe
#PBS -V
#PBS -s /bin/sh
myprog < /data/me/mydata

PBS-specific environment variables:

$PBS_O_HOST name of the host upon which the qsub command is running

$PBS_O_QUEUE
name of the original queue to which the job was submitted

$PBS_O_SYSTEM operating system name given by uname -s on $PBS_O_HOST

$PBS_O_WORKDIR absolute path of the directory from which the qsub command was given

$PBS_ENVIRONMENT either PBS_BATCH or PBS_INTERACTIVE

$PBS_JOBID job identifier assigned to the job by the batch system

$PBS_JOBNAME job name supplied by the user

$PBS_NODEFILE pathname of the file containing the list of nodes assigned to the job

$PBS_QUEUE name of the queue from which the job is executed

Monitoring and deleting qsub jobs

Monitor through the web: http://biowulf.nih.gov/sysmon/

Monitor interactively:

[biowulf]$ freen
        m1024   m2048   m4096   m8192   Total
----------------- GeneralPool -----------------
o2800   /       /       0/210   /       0/210
o2200   /       22/232  0/58    /       22/290
o2000   /       17/40   /       /       17/40
p2800   2/79    91/195  0/62    /       93/336
-----------------   Myrinet   -----------------
o2200   /       34/71   /       /       34/71
o2000   38/47   /       /       /       38/47
p2800   37/38   /       /       /       37/38
----------------- Infiniband  -----------------
o2800   /       /       14/93   /       14/93
-----------------  Reserved   -----------------
o2800   /       46/89   /       27/34   73/123
o2600   /       /       46/274  /       46/274
------------------- Altix --------------------
Available: 15 processors, 15.2 GB memory

qstat -u displays simple list of jobs for a single user:

[biowulf]$ qstat -u me

biobos:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
999999.biobos   me       norm     MyJob        8713   1   1    --    --  R 99:99

qstat -f jobid displays a detailed report of a single job:

[biowulf]$ qstat -f 999999.biobos
Job Id: 999999.biobos
    Job_Name = MyJob
    Job_Owner = me@p1397
    resources_used.cpupercent = 98
    resources_used.cput = 23:26:24
    resources_used.mem = 9452kb
    resources_used.ncpus = 1
    resources_used.vmem = 152328kb
    resources_used.walltime = 23:26:56
    job_state = R
    queue = norm
    server = biobos
    Checkpoint = u
    ctime = Wed Jan  4 14:25:35 2006
    Error_Path = p1397:/home/me/MyJob.e999999
    exec_host = p295/0
    Hold_Types = n
    Join_Path = oe
    Keep_Files = n
    Mail_Points = ae
    mtime = Wed Jan  4 14:27:00 2006
    Output_Path = p1397:/home/me/MyJob.o999999
    Priority = 0
    qtime = Wed Jan  4 14:25:35 2006
    Rerunable = True
    Resource_List.ncpus = 1
    Resource_List.neednodes = 1:faste
    Resource_List.nodect = 1
    Resource_List.nodes = 1:faste
    session_id = 8713
    Variable_List = PBS_O_HOME=/home/me,PBS_O_LANG=en_US,
        PBS_O_LOGNAME=me,PBS_O_PATH=/bin:/usr/bin,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=p1397,PBS_O_WORKDIR=/home/me
        PBS_O_SYSTEM=Linux,PBS_O_QUEUE=vlong
    comment = Job run at Wed Jan 04 at 14:26
    etime = Wed Jan  4 14:25:35 2006

qdel jobid kills the job

[biowulf]$ qdel 999999.biobos

Sometimes you have to push:

[biowulf]$ qdel -W force 999999.biobos

Delete all your jobs:

[biowulf]$ qdel -W force `qselect -u me`

The swarm command

MPI and multirun

http://www-unix.mcs.anl.gov/mpi/

MPI is a library specification for message-passing, proposed as a standard by a broadly based committee of vendors, implementors, and users. A program is compiled using MPICH (TCP/IP) or MPICH-GM (Myrinet GM), and the program is run using the command mpirun:

mpirun -nolocal -machinefile $PBS_NODEFILE -np 8 MyProg

-nolocal specifies to not run the processes on the current node (Biowulf head node)
-machinefile $PBS_NODEFILE specifies which nodes to run on
-np 8 specifies running 8 processes; this is suitable for running on 4 dual-processor nodes

Here is a typical batch command file to run an MPI-compiled program (AMBER):

amber.run:

#!/bin/csh
#PBS -N sander
#PBS -m be
#PBS -k oe


set path = (/usr/local/mpich-pg/bin $path )
set file=/data/me/amber/dinuc_test


cd /data/me/amber/nomyri
date


mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/amber/exe.mpich-pg/sander \
-i $file.in -o $file.out -p $file.top -c $file.coor -x $file.crd -e $file.en \
-inf $file.info -r $file.rst

This script can be submitted with the qsub command

[biowulf]$ qsub -v np=8 -l nodes=4:o2200 amber.run

The multirun command

Similar to swarm, but more controlled (and oftentimes less efficient), creating a single job with unified STDOUT and STDERR output files.

1. Create an executable shell script which will run multiple instances of your program (run6.sh):

#!/bin/csh
switch ($MP_CHILD)
  case 0:
    MyProg < args0
  breaksw
  case 1:
    MyProg < args1
  breaksw
  case 2:
    MyProg < args2
  breaksw
  case 3:
    MyProg < args3
  breaksw
  case 4:
    MyProg < args4
  breaksw
  case 5:
    MyProg < args5
  breaksw
endsw

2. Use mpirun in your batch command file (MyJob.sh) to run the mpi shell program (run6.sh):

#!/bin/tcsh
#PBS -N MyJob
#PBS -m be
#PBS -k oe

set path=(/usr/local/mpich/bin $path)
mpirun -machinefile $PBS_NODEFILE -np 6 \
 /usr/local/bin/multirun -m /home/me/run6.sh

3. Submit the job to the batch system:

[biowulf]$ qsub -l nodes=3 MyJob.sh

The term "large-scale" here refers to repetively executing a series of programs on a large number of individual inputs (protein structures, nucleotide sequences, data sets, etc.).

Practical tips for parallelizing jobs using scripts

Managing I/O, memory, and disk space requirements

Visualizing results

This document is available as http://helix.nih.gov/talks/strbio.html
Biowulf home page | Helix Systems | NIH
Last modified: 05 Dec 2007

home page	http://amber.scripps.edu/
version	9.0
type	molecular dynamics
ease-of-use	*
documentation	http://biowulf.nih.gov/apps/amber.html
parallelized?	yes
myrinet?	yes
scaling	8-16 cpu

home page	http://www.charmm.org
version	27-34, others
type	molecular dynamics
ease-of-use	*
documentation	http://biowulf.nih.gov/apps/charmm/index.html
parallelized?	yes
myrinet?	yes
scaling	16 cpu

home page	http://www.gromacs.org
version	3.3.1,3.2.1
type	molecular dynamics
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/gromacs/index.html
parallelized?	yes
myrinet?	yes
scaling	10-20 cpu

home page	http://www.ks.uiuc.edu/Research/namd
version	2.6
type	molecular dynamics
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/namd/index.html
parallelized?	yes
myrinet?	no
scaling	4-32+ cpu

home page	http://apbs.sourceforge.net
version	0.5.0
type	electrostatics
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/apbs.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.msg.ameslab.gov/GAMESS/
version	Mar. 2007
type	quantum chemistry
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/gamess.html
parallelized?	yes
myrinet?	no
scaling	8 cpu?

home page	http://www.gaussian.com/g03.htm
version	D02
type	quantum chemistry
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/gaussian/
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.q-chem.com/
version	2.1
type	quantum chemistry
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/q-chem.html
parallelized?	yes
myrinet?	no
scaling	?

home page	http://compbio.ornl.gov/structure/prospect2/index.html
version	2.0
type	structure prediction
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/prospect_guide.html
parallelized?	yes
myrinet?	no
scaling	64+ cpu

home page	http://www.nmr.chem.uu.nl/haddock/
version	2.0
type	structure prediction
ease-of-use	*
documentation	http://biowulf.nih.gov/apps/haddock_biowulf.html
parallelized?	yes
myrinet?	no
scaling	?

home page	http://cns.csb.yale.edu/v1.1/
version	1.1
type	structure determination and refinement
ease-of-use	*
documentation	http://helix.nih.gov/apps/structbio/cns.html, http://biowulf.nih.gov/apps/xplor-nih.html
parallelized?	yes and no
myrinet?	no
scaling	?

home page	http://www.mbg.duth.gr/~glykos/Qs.html
version	1.3
type	structure determination and refinement
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/Qs/index.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.gv.cnrs-gif.fr/english/vs2-english.html
version	n/a
type	structure determination and refinement
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/amore/index.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.povray.org/
version	3.1,3.6
type	visualization
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/povray/index.html
parallelized?	yes and no
myrinet?	no
scaling	n/a

home page	http://www.ks.uiuc.edu/Research/vmd/current/docs.html
version	1.8.6
type	visualization
ease-of-use	****
documentation	http://helix.nih.gov/Applications/vmd.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.umass.edu/microbio/rasmol/
version	2.7.2.1
type	visualization
ease-of-use	***
documentation	http://www.openrasmol.org/
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.rosettacommons.org/
version	2.2
type	protein structure prediction and modeling
ease-of-use	*
documentation	http://biowulf.nih.gov/apps/Rosetta.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://zlab.bu.edu/zdock/index.shtml
version	2.3
type	protein modeling
ease-of-use	***
documentation	http://biowulf.nih.gov/apps/zdock.html
parallelized?	yes
myrinet?	no
scaling	up to 32 cpu

home page	http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:nest
version	n/a
type	homology modeling
ease-of-use	*
documentation	http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:nest
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://autodock.scripps.edu/
version	3.0.5/td>
type	protein modeling
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/autodock.html
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
version	3.5
type	structure analysis
ease-of-use	**
documentation	http://biowulf.nih.gov/apps/procheck/
parallelized?	no
myrinet?	no
scaling	n/a

home page	http://swift.cmbi.ru.nl/gv/dssp/
version	Nov. 2002, CMBI version
type	structure analysis
ease-of-use	***
documentation	n/a
parallelized?	no
myrinet?	no
scaling	n/a

faste	fast ethernet (100 Mb/s) interconnect
gige	gigabit ethernet (1 Gb/s) interconnect
myr2k	Myrinet (2 Gb/s) interconnect
ib	Infiniband (10 Gb/s) interconnect
m1024	1 GB memory
m2048	2 GB memory
m4096	4 GB memory
p2800	2.8 GHz Intel Xeon
o2000	2.0 GHz AMD Opteron 246
o2200	2.2 GHz AMD Opteron 248
o2600	2.6 GHz AMD Opteron 285, dual-core (4 CPU)
o2800	2.8 GHz AMD Opteron 254
altix	SGI Altix 350 (see Firebolt page for more information)
x86-64	o2000 + o2200 nodes + o2800 nodes
dc	dual-core (o2600) nodes
centos	2.8 GHz dual-core (o2800) nodes running CentOS 4.2

-N name	Declare a name for the job
-m mail_options	Send mail to user upon 'a' (abort), 'b' (begin), 'e' (end)
-k keep	e = standard error, o = standard output
-s path_list	Declares the shell that interprets the job
-v variable_list	Export named environment variables to hosts running job
-V	Export all environment variables

$PBS_O_HOST	name of the host upon which the qsub command is running
$PBS_O_QUEUE	name of the original queue to which the job was submitted
$PBS_O_SYSTEM	operating system name given by uname -s on $PBS_O_HOST
$PBS_O_WORKDIR	absolute path of the directory from which the qsub command was given
$PBS_ENVIRONMENT	either PBS_BATCH or PBS_INTERACTIVE
$PBS_JOBID	job identifier assigned to the job by the batch system
$PBS_JOBNAME	job name supplied by the user
$PBS_NODEFILE	pathname of the file containing the list of nodes assigned to the job
$PBS_QUEUE	name of the queue from which the job is executed