CHARMM (Chemistry at HARvard Macromolecular Mechanics):
CHARMM on Biowulf is built and documented by Rick Venable, NHLBI.
A couple of cover scripts are now available in /usr/local/bin for general use, set up to run several CHARMM versions in multiple sizes, with appropriate MPI libraries for ethernet, Myrinet, and Infiniband. The commands are mpicharmm, which is designed to be used in scripts submitted via the PBS qsub command, and qcharmm, which invokes qsub and passes arguments to a secondary script (named runcharmm). The syntax of both is nearly identical, except that qcharmm has an additional required argument, prfx=, which gives the input file prefix; the .inp extension is assumed for the input file, and the output file is created with the .out extension.
Both cover scripts have required arguments; typing the name of either without arguments prints a brief syntax message.
There is also a provision for the use of private CHARMM versions. The scripts may change over time as the Biowulf system and CHARMM evolve; changes are documented in the Revision Notes section.
qcharmm
For submitting single CHARMM jobs, qcharmm consolidates qsub, mpirun, and the chosen CHARMM executable into a single command interface. Examples:
To submit and run the input min-prot.inp with the default version (c31b2) on ethernet with 2 procs, creating the log file min-prot.out
qcharmm proc=2 prfx=min-prot
As above, but c32b2 for 8 procs, Myrinet 2000, and the large size
qcharmm proc=8 prfx=min-prot vrsn=c32b2 comm=myr2k size=large
To submit an analysis run (bkbnrms.inp), passing the CHARMM variable ("parameter") N set to 50 to the input script (a reference to @N in the script is replaced by the number 50)
qcharmm proc=single prfx=bkbnrms N:50
Use Infiniband with 8 procs, and xxlarge size
qcharmm proc=8 prfx=min-prot comm=ib size=xxlarge
Run a 64-bit executable; required for analysis of Infiniband simulations
qcharmm proc=0 word=64 prfx=bkbnrms N:50
Finally, it should be noted that CHARMM does not perform well in parallel using TCP/IP over ethernet, so qcharmm will balk at running more than 2 parallel processes on ethernet-only nodes. See the Biowulf Users Guide and the output of the shnodes command for more information on node memory sizes and usage.
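The ethernet limit can also be checked before a job is submitted. The following is a minimal sketch, written in Bourne shell rather than csh so it can be tried on any system; check_eth_procs is a hypothetical helper and is not part of qcharmm, whose own check may differ.

```shell
#!/bin/sh
# Hypothetical pre-submission check; NOT the logic used inside qcharmm.
# It mirrors the documented rule: at most 2 parallel processes on ethernet.
check_eth_procs() {
    comm=$1   # eth, myr2k, or ib
    proc=$2   # integer, or "single"
    case $proc in
        single|0) return 0 ;;   # single-threaded runs are always allowed
    esac
    if [ "$comm" = eth ] && [ "$proc" -gt 2 ]; then
        echo "refusing: $proc procs on ethernet (limit is 2)" >&2
        return 1
    fi
    return 0
}
```

A wrapper could call check_eth_procs eth 8 and skip the qcharmm submission when it returns a nonzero status.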
mpicharmm
The other command, mpicharmm, is designed for use in scripts, and requires the user to know how to use PBS qsub. Like qcharmm, it allows CHARMM variables ("parameters") to be set on the command line via the colon syntax (e.g. N:50). A simple example is the following, called run.csh:

#!/bin/csh
#PBS -N rundyn
#PBS -j oe
#PBS -m ae
# CHANGE TO SUBMISSION SUBDIR
cd $PBS_O_WORKDIR
# INVOKE CHARMM
mpicharmm proc=2 < dyn.inp >& dyn.out

The above script is run on Biowulf via the PBS command

qsub -l nodes=1 run.csh

Likewise, for 8 procs on Myrinet:

#!/bin/csh
#PBS -N rundyn
#PBS -j oe
#PBS -m ae
# CHANGE TO SUBMISSION SUBDIR
cd $PBS_O_WORKDIR
# INVOKE CHARMM
mpicharmm proc=8 < dyn.inp >& dyn.out

The above script is run on Biowulf via the PBS command

qsub -l nodes=4:myr2k run.csh

Same case, but for Infiniband:

#!/bin/csh
#PBS -N rundyn
#PBS -j oe
#PBS -m ae
# CHANGE TO SUBMISSION SUBDIR
cd $PBS_O_WORKDIR
# INVOKE CHARMM
mpicharmm proc=8 prfx=dyn >& dyn.out

The above script is run on Biowulf via the PBS command

qsub -l nodes=4:ib run.csh

Note that the prfx= option is required for Infiniband usage, due to changes in the MPICH library used.
Note that you CANNOT pass arguments directly to run.csh via the 'qsub' command line; it is possible to pass variables, however, via the -v option to 'qsub'. For a detailed variable passing example, see the qcharmm and runcharmm scripts themselves (in /usr/local/bin).
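As an illustration of variable passing with the -v option, here is a minimal dry-run sketch in Bourne shell (not csh, so it can be tried without a PBS installation). It only prints the qsub command it would run. The helper name qsub_cmd and the variable names PROC and INP are assumptions made for this example and do not come from the qcharmm or runcharmm scripts.

```shell
#!/bin/sh
# Dry-run sketch: assemble and print (do not execute) a qsub command line
# that passes NAME=VALUE pairs to the job through the PBS -v option.
# qsub_cmd, PROC and INP are hypothetical names used only for illustration.
qsub_cmd() {
    nodes=$1
    script=$2
    shift 2
    vars=
    for kv in "$@"; do
        # -v takes a single comma-separated list of NAME=VALUE items
        vars=${vars:+$vars,}$kv
    done
    if [ -n "$vars" ]; then
        echo "qsub -l nodes=$nodes -v $vars $script"
    else
        echo "qsub -l nodes=$nodes $script"
    fi
}

qsub_cmd 4:myr2k run.csh PROC=8 INP=dyn
# prints: qsub -l nodes=4:myr2k -v PROC=8,INP=dyn run.csh
```

Inside run.csh the passed values would then appear as environment variables, so a line such as mpicharmm proc=$PROC < $INP.inp >& $INP.out could use them, assuming those names were chosen.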
The CHARMM X11 graphics commands can also be used via mpicharmm; a prerequisite is to log in to biowulf using the ssh command. From some systems, the -X option to ssh is needed to set up use of X11, e.g.
ssh -X biowulf
The next step is to request an "interactive batch" node, via
qsub -l nodes=1 -I -V
Finally, start CHARMM via
mpicharmm proc=single
At this point, it's useful to have one or two CHARMM stream files that do all the system setup, e.g. read in RTF, PARAM, PSF and COOR file(s). Then one can simply type
* a title
*
stream init-prot.str
graphx
draw sele .not. type H* end
center
scale 0.5
... (etc.)
Graphics features include trajectory file viewing, and output of coordinate data as a file of POV-Ray objects. The on-screen graphics do not work correctly in 64-bit mode, but POV files can still be produced from the trajectory files.
Command syntax listings
The following listings show the result of typing either command w/o any of the required arguments:

biobos ~ [7] qcharmm
(qcharmm) ERROR: Arg prfx= is REQUIRED
qcharmm prfx=STR [proc=N] [vrsn=STR] [size=STR] [comm=STR] [fort=STR] [word=64]
ARGS, DEFAULT INDICATED WITH *:
prfx=STR   CHARMM input file prefix; required
proc=N     N = integer no. of processors; defaults to single
vrsn=STR   version; c28b2, c29b2, c30b2, c31b1, c31b2*, c32b1, c32b2
fort=STR   version of Fortran; g77, pgi*, path (c32b2)
word=64    use 64-bit executable; forces fort=path vrsn=c32b2
size=STR   medium*, large, xlarge, xxlarge (c32b2,path,64)
---------- Job Properties -------------
rest=STR   n or y, default n; PBS restart
quik=STR   n or y, default n; use "quick" queue (< 60 min)
name=STR   name for job in PBS queue; up to 12 chars, no space or dash
---------- Node Properties -------------
comm=STR   eth*, myr2k, ib; communications interface type
mmry=STR   RAM, no default; m1024, m2048, m4096
node=STR   node options, no default; p1400 p1800 p2800 k8 o2000 o2200 o2800 linux22 gige
(1) N = integer ; STR = character string
(2) Equals sign required, case sensitive, no spaces
*** Be sure to specify comm=myr2k for Myrinet ***
Using comm=ib (Infiniband) forces vrsn=c32b2 fort=path word=64
Set the env var CHMEXE to use a private executable.
Use proc=single or proc=0 for a single-threaded (non-MPI) version; default.
Not all combinations are valid

biowulf ~ [6] mpicharmm
(mpicharmm) ERROR: Arg proc= REQUIRED
mpicharmm proc=N [vrsn=STR] [size=STR] [comm=STR] [fort=STR] [word=64] [prfx=STR]
ARGS, DEFAULT INDICATED WITH *:
proc=N     N = no. of processors; required, 0 for non-parallel
vrsn=STR   version; c28b2, c29b2, c30b2, c31b1, c31b2*, c32b1, c32b2
fort=STR   version of Fortran; g77, pgi*, path (c32b2)
size=STR   medium*, large, xlarge, xxlarge (64-bit)
word=64    use a 64-bit executable (c32b2,path)
comm=STR   eth, myr2k, ib; defaults to node type (qsub)
prfx=STR   input file prefix, required for Infiniband (ib)
(1) N = integer ; STR = character string
(2) Equals sign required, case sensitive, no spaces
Set the env var CHMEXE to use a private executable.
Use proc=single or proc=0 for a single-threaded (non-MPI) version.
Not all combinations are valid
Infiniband (qsub -l nodes=N:ib) forces vrsn=c32b2 fort=path word=64

The following listing indicates the currently available CHARMM versions, sizes with atom limits, and parallel communications interfaces supported; cf. the syntax listings above. The CHARMM versions are:
- c27b3: not so recent beta version of CHARMM v. 27
- c28b2: last beta version of CHARMM v. 28
- c29b2: last beta version of CHARMM v. 29
- c30b2: last beta version of CHARMM v. 30
- c31b2: default; final beta version of CHARMM v. 31
- c32b2: last beta version of CHARMM v. 32
- c33b2: 2nd beta version of CHARMM v. 33
The parallel communications interfaces are:
- eth: uses the ethernet interface for MPI parallel communications
- myr2k: uses the Myrinet 2000 interface for MPI parallel communications
- ib: uses the Infiniband interface for MPI parallel communications; 64-bit
- single: no MPI parallel communications, i.e. single-threaded; all but c27b3 g77
The sizes are:
- medium: default if not specified via size= option; 25,000 atom limit
- large: 60,000 atom limit; default for c32b2
- xlarge: 240,000 atom limit
- xxlarge: 360,000 atom limit; 64-bit only (c32b2)
The compilers are:
- g77: compiled with GNU g77 compiler, and appropriate MPICH libs
- ifort: compiled with Intel ifort compiler, and appropriate MPICH libs; c33b2 only
- gfort: compiled with GCC4 gfortran compiler, and appropriate MPICH libs; c33b2 only
- pgi: compiled with Portland Group pgf77 compiler, and appropriate MPICH libs; default, much better performance
- path: compiled with Pathscale compiler, and appropriate MPICH libs; both 32-bit and 64-bit, default for Infiniband
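The atom limits listed for the sizes can be turned into a small lookup that picks the smallest size= value for a given system. This is a minimal sketch in Bourne shell (not csh); pick_size is a hypothetical helper, and it assumes that a system with exactly the limiting number of atoms still fits. Note that xxlarge is documented as 64-bit only (c32b2), which the sketch does not check.

```shell
#!/bin/sh
# Hypothetical helper: map an atom count to the smallest documented size=
# value. Assumes each documented limit is inclusive.
pick_size() {
    natom=$1
    if   [ "$natom" -le 25000 ];  then echo medium
    elif [ "$natom" -le 60000 ];  then echo large
    elif [ "$natom" -le 240000 ]; then echo xlarge
    elif [ "$natom" -le 360000 ]; then echo xxlarge   # 64-bit only (c32b2)
    else
        echo "no documented size holds $natom atoms" >&2
        return 1
    fi
}
```

For example, pick_size 48000 prints large, which could then be passed along as size=large to qcharmm or mpicharmm.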
Academic CHARMM executables with different features and sizes are available
via /usr/local/charmm, as well as the doc, toppar, test, and support subdirs,
e.g.:
biobos [117] ( cd /usr/local/charmm/c30b1; ls -F )
ChangeLog.c30  doc/
g77-lrg*  g77-lrg.gm2*  g77-lrg.one*
g77-med*  g77-med.gm2*  g77-med.one*
g77-xlg*  g77-xlg.gm2*  g77-xlg.one*
pgi-lrg*  pgi-lrg.gm2*  pgi-lrg.one*
pgi-med*  pgi-med.gm2*  pgi-med.one*
pgi-xlg*  pgi-xlg.gm2*  pgi-xlg.one*
support/  test/  toppar/

The executables (indicated with *) are named according to the pattern COMPILER-SIZE for the g77 and PGI compilers, with med, lrg, and xlg for medium (25K atoms), large (60K atoms), and xlarge (240K atoms), respectively. The pgi-* versions are recommended because the code runs 20-30% faster; the g77 versions were kept for troubleshooting purposes. The .gm2 extension indicates that the Myrinet MPI libraries were used instead of the standard TCP/IP MPI libraries used for ethernet. The .gm2 versions should only be used for jobs submitted to Biowulf via the -l nodes=N:myr2k options to the PBS 'qsub' command (where N is some integer).
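The naming pattern can be expressed as a short function. This is a minimal sketch in Bourne shell (not csh), and exe_name is a hypothetical helper rather than anything taken from the cover scripts. It covers only the g77 and pgi compilers and the ethernet and Myrinet builds described above; the .one variants in the listing are left out because this section does not define them.

```shell
#!/bin/sh
# Hypothetical helper: build an executable name from the documented
# COMPILER-SIZE pattern, adding .gm2 for Myrinet builds.
exe_name() {
    fort=$1   # g77 or pgi
    size=$2   # medium, large, or xlarge
    comm=$3   # eth or myr2k
    case $fort in
        g77|pgi) ;;
        *) echo "unsupported compiler: $fort" >&2; return 1 ;;
    esac
    case $size in
        medium) sz=med ;;
        large)  sz=lrg ;;
        xlarge) sz=xlg ;;
        *) echo "unsupported size: $size" >&2; return 1 ;;
    esac
    case $comm in
        eth)   echo "$fort-$sz" ;;
        myr2k) echo "$fort-$sz.gm2" ;;
        *) echo "unsupported interface: $comm" >&2; return 1 ;;
    esac
}
```

With these rules, exe_name pgi large myr2k yields pgi-lrg.gm2, matching one of the files in the directory listing.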
Here is a more detailed usage example for mpicharmm, illustrating the use of paired scripts to keep a simulation running for an extended period without intervention. Several points need to be made about these scripts:
- The scripts assume the use of generic output file names, e.g. dyn.trj dyn.out
- The initiating script (which calls 'qsub') may be renamed to break the chain gracefully at the end of a run
- Almost all job failures will break the chain; the one known (rare) exception writes a bad restart file, which causes the next job to fail and break the chain
- Killing a running job (via 'qdel') will also break the chain
Samples and variations of the csh and CHARMM scripts discussed below can also be found in the /usr/local/charmm/scripts subdir.
First, there's the initiating script, used to start the process, which we'll call biowulf.com:
#!/bin/csh -f
qsub -l nodes=4:myr2k dyn.csh

More important (and more complicated) is the dyn.csh script, which runs CHARMM, checks the output files to test for a successful run, renames each output file by adding a sequence number, and submits the next run to the queue. For a failed run, the output file is renamed to dyn.err.TS, where TS is a timestamp. This script has been extensively tested and used on many systems with appropriate modifications, including LoBoS, Galaxy, and Biowulf.
#! /bin/csh -f
#PBS -N RunDyn
#PBS -j oe
# ASSUMPTION (1): input files are named dynXX.inp where XX is optional
#   useful for startup phase; for example, to start using dynstrt.inp,
#   dyn.csh strt
#   would be used; for production dynamics runs, simply dyn.csh
# ASSUMPTION (2): output files are named dyn.res dyn.trj dyn.out
# ASSUMPTION (3): previous restart file read as dyn.rea
#
cd $PBS_O_WORKDIR
set chm = "mpicharmm proc=8 size=large"

if ( -e next.seqno ) then
  $chm < dyn.inp >& dyn.out
else
  $chm < dynstrt.inp >& dyn.out
endif

set okay = true
# TEST FOR EXISTENCE, THEN NONZERO LENGTH OF OUTPUT FILES
if ( -e dyn.res && -e dyn.trj ) then
  @ res = `wc dyn.res | awk '{print $1}'`
  @ tsz = `ls -s dyn.trj | awk '{print $1}'`
  @ nrm = `grep ' NORMAL TERMINATION ' dyn.out | wc -l`
  if ( $res > 100 && $tsz > 0 && $nrm == 1 ) then
    # SUCCESSFUL RUN; COPY RESTART FILE
    cp dyn.res dyn.rea
    # DETERMINE RUN NUMBER
    if ( -e next.seqno ) then
      @ i = `cat next.seqno`
    else
      @ i = 1
    endif
    # ENABLE/DISABLE ANALYSIS EVERY 2ND RUN
    # if ( 0 == $i % 2 ) qsub -l nodes=1 chk.csh
    # NUMBER THE OUTPUT FILES; CHANGE EXTENSIONS TO SUIT APPLICATION
    foreach fil ( out res trj )
      mv dyn.$fil dyn$i.$fil
    end
    gzip dyn$i.out dyn$i.res
    # CONDITIONAL END CHECK
    if ( -e last.seqno ) then
      @ l = `cat last.seqno`
      if ( $i == $l ) goto endrun
    endif
    # SUBMIT THE NEXT JOB
    ./biowulf.com
endrun:
    @ i += 1
    echo $i > next.seqno
  else
    # ZERO LENGTH FILE(S)
    set okay = false
  endif
else
  # FILE DOESN'T EXIST
  set okay = false
endif

# TEST FOR CHARMM RUN FAILED; CREATE .ERR FILE WITH TIMESTAMP
if ( $okay == true ) then
else
  set ts = `date +%m%d.%H%M`
  mv dyn.out dyn.err.$ts
  exit(201)
endif

To complete the setup, the CHARMM input files need to use the generic names; first, there is dynstrt.inp used only for the first run:
* start NPAT melittin in dopc bilayer sim at high hydration
*
bomlev -1
stream inim2p1.str

! SETUP SHAKE
shake fast bonh param

open unit 41 write file name dyn.trj
open unit 51 write card name dyn.res

prnlev 3 NODE 0
dyna cpt start nstep 50000 timestep 0.001 -
     pcons pint pref 1.0 pmzz 2500. pmxx 0.0 pmyy 0.0 -
     hoover reft 303.15 tmass 20000. -
     inbfrq 20 atom vatom cutnb 14.0 ctofnb 11. cdie eps 1. -
     ctonnb 7. vswitch cutim 14.0 imgfrq 20 wmin 1.0 -
     ewald pmew fftx 48 ffty 48 fftz 64 kappa .34 spline order 6 -
     iprfrq 5000 ihtfrq 0 ieqfrq 0 ntrfrq 1000 -
     iuncrd 41 iunrea -1 iunwri 51 kunit -1 -
     nprint 100 nsavc 500 nsavv 0 ihbfrq 0 ilbfrq 0 -
     firstt 303.15 finalt 303.15 teminc 10.0 tstruct 303.15 tbath 303.15 -
     iasors 1 iasvel 1 iscvel 0 ichecw 0 twindh 5.0 twindl -5.0

stop

Finally, there is the truly generic dyn.inp file, which is run repeatedly until the desired simulation time duration is achieved:
* melittin in dopc lipid bilayer NPAT at high hydration
*
bomlev -1
stream inim2p1.str

! SETUP SHAKE
shake fast bonh param

open unit 31 read card name dyn.rea
open unit 41 write file name dyn.trj
open unit 51 write card name dyn.res

prnlev 3 node 0
dyna cpt restart nstep 50000 timestep 0.001 -
     pcons pint pref 1.0 pmzz 2500.0 pmxx 0.0 pmyy 0.0 -
     inbfrq 20 atom vatom cutnb 14. ctofnb 11. cdie eps 1. -
     ctonnb 7. vswitch cutim 14.0 imgfrq 20 wmin 1.0 -
     ewald pmew fftx 48 ffty 48 fftz 64 kappa .34 spline order 6 -
     hoover reft 303.15 tmass 20000. -
     iprfrq 5000 ihtfrq 0 ieqfrq 0 ntrfrq 1000 -
     iuncrd 41 iunrea 31 iunwri 51 kunit -1 -
     nprint 100 nsavc 500 nsavv 0 ihbfrq 0 ilbfrq 0 -
     firstt 303.15 finalt 303.15 teminc 10.0 tstruct 303.15 -
     iasors 1 iasvel 1 iscvel 0 ichecw 0 twindh 5.0 twindl -5.0

stop

A key thing to note is that dyn.rea is always the restart file created from the preceding run (see dyn.csh). An added feature is the use of gzip to compress output and restart files, saving considerable disk space. Compressed files may be browsed easily via e.g.
zcat dyn2.out.gz | less
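The same idea extends to checking a whole series of compressed logs at once. The following minimal sketch, in Bourne shell rather than csh, applies the success test used by dyn.csh (exactly one ' NORMAL TERMINATION ' line) to each file it is given. The function name check_logs is an assumption for this example, and gzip -dc is used in place of zcat only for portability.

```shell
#!/bin/sh
# Hypothetical helper: report which gzip-compressed CHARMM logs contain
# exactly one ' NORMAL TERMINATION ' line, the same test dyn.csh applies.
check_logs() {
    for f in "$@"; do
        # grep -c exits nonzero on zero matches but still prints 0
        n=$(gzip -dc "$f" | grep -c ' NORMAL TERMINATION ' || true)
        if [ "$n" -eq 1 ]; then
            echo "$f ok"
        else
            echo "$f FAILED"
        fi
    done
}
```

Typical use in a run directory would be check_logs dyn*.out.gz, which prints one status line per numbered output file.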
There is also a provision to use both qcharmm and mpicharmm with a private CHARMM executable; vrsn=, size=, and fort= args are ignored in this case, but comm=myr2k is needed for Myrinet based executables run via qcharmm. For both, the env var CHMEXE must be set; for qcharmm, it must be set in ~/.cshrc, because the job will be run in a new shell when started by the PBS queueing system. It can be set in a script such as the above run.csh for mpicharmm. Note that the executable must be compiled using the proper MPI include and library files in order to use the high-speed Myrinet communications.
March 2001
Problems related to the use of the Linux 2.4.0 kernel on new Biowulf nodes with 1 or 2 GB of RAM (m1024 or m2048) have led to the following changes in the cover scripts:
- qcharmm; new arg, mmry=mN, where N is one of [ 256 512 1024 2048 ]; m256 is default, mmry=m512 is okay
- mpicharmm, runcharmm; jobs on 2.4.0 kernel nodes fall back to a slower g77 version of CHARMM (instead of default pgf77 version [incompatible])
- mpicharmm, runcharmm; jobs on 2.4.0 kernel nodes limited to 4 processors; no 'tcpfix' patch available
qsub -l nodes=4:m512 run.csh
April 2001
Kernel changes, a new CHARMM version (c27b4), and the final installation of Myrinet 2000 resulted in the following changes to the cover scripts and documentation:
- Myrinet 2000; added support for comm=myr2k arg value for c27b3, c27b4
- qcharmm; no default node memory size for comm=myr, comm=myr2k
- large memory nodes upgraded to 2.4.2 kernel; PGI version now works, fallback to g77 version removed; 4 proc limit retained (still no tcpfix patch)
- documentation re-arranged, support grid added, syntax listings moved to just after command descriptions, private executable info moved to near end
- added fort= info to syntax listings
October 2001
Changes to cover scripts, available versions, and this documentation:
- Added new CHARMM versions, c28b1 and c29a1
- Added support for use of single-threaded executable (non-MPI)
- Removed default memory request from qcharmm
- Dropped support for old Myrinet (hardware removed)
- Dropped c28a3 version (unstable)
June 2002
Changes to cover scripts, available versions, and this documentation:
- Added new CHARMM versions, c28b2 and c29a2; dropped c29a1
- Added node= option to qcharmm
- Documented GRAPHX usage via "interactive batch"
October 2002
Changes to cover scripts, available versions, and this documentation:
- Added new CHARMM versions, c29b1 and c30a1; dropped c29a2
- Added pointers to CHARMM .doc files on Biowulf server
August 2003
Changes to cover scripts, available versions, and this documentation:
- Added new CHARMM versions, c30b1 and c31a1; dropped other alpha versions
- Dropped table from Supported Versions, defined more terms
April 2005
Changes to cover scripts, available versions, and this documentation:
- Added new CHARMM versions, c31b1 and c31b2; dropped c31a1, c28b1, c29b1
- Updated command help output listing, typical /usr/local/charmm subdir listing
April 2006
- Added new CHARMM version, c32b2; Pathscale compiler added
- Added documentation for 64-bit support (c32b2)
- Added documentation for Infiniband; 64-bit, c32b2
This document prepared by Rick Venable, NHLBI/BBC Lab of Computational Biology