National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 
Package     Platform   Version   Module   Docs
gaussian03  bassi      c2        g03/c2   Vendor
gaussian03  bassi      d2        g03/d2   Vendor
gaussian03  bassi      e1        g03/e1   Vendor
gaussian03  jacquard   c2        g03      Vendor
gaussian03  jacquard   d2        g03/d2   Vendor
(*) Denotes limited support

Gaussian 03

Gaussian 03 is a connected series of programs for performing semi-empirical, density functional theory (DFT), and ab initio molecular orbital calculations.

Setup and Access for Gaussian 03

The modules package controls access to software.

To use the default version of Gaussian 03, include the line:

% module load g03

in your .profile.ext or .login.ext files, or type this command whenever you want to access Gaussian for a single session.

Access Restrictions

Gaussian is available to the general user community at NERSC subject to the License To Use Agreement between U.C. Regents, Lawrence Berkeley National Lab and Gaussian Inc. This agreement restricts use of the Gaussian software in that NERSC may only "provide to third parties who are not directly or indirectly engaged in competition with Gaussian access to the binary code of the Software."

You must certify that this condition is met by running the g03_register command once. A NERSC consulting trouble ticket will be generated, and you will be granted access in a timely manner.

Running Gaussian 03

Gaussian can be run interactively using the command:

% g03 <input >output & 

Running in batch

To submit a job to the batch scheduler, you can use the llgauss command, or make a batch file as shown below:


llgauss [-j job_name] [-t time] [-p priority] [-n no. of procs.] [-m procs per node] [-o output] input 

where:

job_name
Job name for the batch system; defaults to the name of the input file, less the extension.
time
Maximum CPU time required; defaults to 3600 seconds.
priority
Priority class to be used; defaults to regular. Valid options are low, regular, debug, or premium.
no. of procs.
Number of processors required. Between 2 and 64 is recommended, depending on the size of the system.
procs per node
Number of processors per node; defaults to 8 on bassi and 2 on jacquard. Use fewer than the default if your job requests a large amount of memory.
output
Name of the output file; defaults to the job name with the extension .out.
input
Name of the input file.

At this point, llgauss prompts for what to do next:

  • submit the job,
  • write out a shell script file which can be submitted later,
  • do nothing and quit.
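As a concrete illustration, a 16-processor job submitted at regular priority with a two-hour CPU limit might be requested as follows (the job name and input file here are hypothetical placeholders):

```shell
# Hypothetical example: job "water_opt", 7200 s CPU limit, regular
# priority, 16 processors; water_opt.inx stands in for your input file.
llgauss -j water_opt -t 7200 -p regular -n 16 water_opt.inx
```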

Writing your own batch script

If you prefer to write your own batch script, rather than have llgauss generate one, the following templates are useful for a serial job. The first is for LoadLeveler (bassi), the second for PBS (jacquard).

#@job_name=t1
#@output=$(job_name).o$(jobid)
#@error=$(job_name).e$(jobid)
#@job_type=parallel
#@network.MPI=sn_all,shared,us
#@total_tasks=1
#@node=1
#@class=debug
#@wall_clock_limit=0:30:00
#@environment = COPY_ALL
#@queue

mkdir $SCRATCH/g03/$LOADL_STEP_ID
cd $SCRATCH/g03/$LOADL_STEP_ID
module load g03
g03 < $HOME/g_tests/T/t1.inx > $HOME/g_tests/T/t1.out
ls -l

#!/bin/ksh -l
#PBS -l nodes=1:ppn=1,walltime=06:00:00
#PBS -N t1 
#PBS -j oe
#PBS -q batch
#PBS -V

mkdir $SCRATCH/g03/$PBS_JOBID
cd $SCRATCH/g03/$PBS_JOBID
module load g03
g03 < $HOME/g_tests/T/t1.inx > $HOME/g_tests/T/t1.out
ls -l

For a parallel job, use the following forms:

#@job_name=t1
#@output=$(job_name).o$(jobid)
#@error=$(job_name).e$(jobid)
#@job_type=parallel
#@network.MPI=sn_all,shared,us
#@total_tasks=16
#@node=2
#@class=regular
#@wall_clock_limit=6:00:00
#@environment = COPY_ALL
#@queue

mkdir $SCRATCH/g03/$LOADL_STEP_ID
cd $SCRATCH/g03/$LOADL_STEP_ID
module load g03
g03l < $HOME/g_tests/T/t1.inx > $HOME/g_tests/T/t1.out
ls -l

#!/bin/ksh -l
#PBS -l nodes=16:ppn=2,walltime=06:00:00
#PBS -N t1 
#PBS -j oe
#PBS -q batch
#PBS -V

mkdir $SCRATCH/g03/$PBS_JOBID
cd $SCRATCH/g03/$PBS_JOBID
module load g03
g03l < $HOME/g_tests/T/t1.inx > $HOME/g_tests/T/t1.out
ls -l

NOTE

  • This relies on your having created a directory $SCRATCH/g03 before you submit your first Gaussian job.
  • You should clean up the contents of $SCRATCH/g03 from time to time; when a Gaussian job crashes, temporary files are usually left behind.
  • All files in $SCRATCH are subject to purging, so copy useful files to your $HOME space or HPSS.
  • For parallel jobs, you need to specify %nproclinda=nprocs in your input file and start Gaussian with g03l.

An alternative strategy to giving explicit paths for input and output is to copy input to $SCRATCH at the beginning of the run and then copy output to permanent storage at the end of the job.
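For the PBS (jacquard) template above, that strategy looks something like the following job body (file names and paths are illustrative):

```shell
# Sketch: stage input into scratch, run, then save output back.
mkdir -p $SCRATCH/g03/$PBS_JOBID
cd $SCRATCH/g03/$PBS_JOBID
cp $HOME/g_tests/T/t1.inx .            # copy input to scratch
module load g03
g03 < t1.inx > t1.out
cp t1.out $HOME/g_tests/T/             # copy output to permanent storage
```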

Files and Disk Usage

Temporary Files, Location and Naming

By default all unformatted files produced by Gaussian (e.g. checkpoint, read-write or integral files) will be written into the directory specified by the environment variable GAUSS_SCRDIR. For serial jobs, this is normally set to your temporary work directory, $SCRATCH on the IBM SP. For parallel jobs (started with g03l) it is set to your current directory.
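If you want the scratch files somewhere other than the default, GAUSS_SCRDIR can be pointed at a job-specific directory before Gaussian starts (the path below is illustrative; NERSC sets $SCRATCH for you, and the fallback is only so the snippet is self-contained):

```shell
# Point Gaussian's scratch files at a per-job directory (path illustrative).
SCRATCH=${SCRATCH:-/tmp}
export GAUSS_SCRDIR=$SCRATCH/g03/scr.$$
mkdir -p "$GAUSS_SCRDIR"
```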

You can save the checkpoint file by using the "link 0" command %chk. This can be useful if you want to retrieve geometries, density matrices etc., from an old run. These commands must appear before all other input, and are used to customize the environment in which the Gaussian program runs. For example:

%Chk=water
#RHF/6-31G

water energy

0 1
O
H 1 1.0
H 1 1.0 2 105.0

will create a checkpoint file named water.chk in the current directory.

The checkpoint file can be specified with a directory,

%Chk=/scratch/scratchdirs/jcarter/water

in which case the file is created with that absolute pathname.

Amount of Disk Space

Some types of calculation can dynamically change algorithm depending on the amount of scratch disk space available. This can be set with the MaxDisk keyword.
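For example, a route section allowing Gaussian to use up to 20 GB of scratch disk might look like this (the method, basis set, and limit shown are placeholders, not recommendations):

```
%Chk=bigjob
#MP2/6-311g(d,p) MaxDisk=20GB
```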

Memory

The link 0 directive %mem specifies how much memory Gaussian may use; if it is absent, a default of 8 Mw is used (which is greater than the interactive limit). For example:

%Chk=c3f8
%mem=12mw
#MP2/6-311g(2d1f) freq

c3f8 mp2 frequency

0 1
...

The Gaussian executables take up about 6 mw of memory. On the IBM SP, a batch job has exclusive use of a node. Most nodes have 16 GB of memory, with about 15.5 GB usable by applications. Usually, 16 Gaussian tasks will run per node, so a memory limit of about 950 MB per task is about as much as you can expect to run well.

For SCF and MP2 calculations involving high angular momentum functions, consult the following table for an estimate of basic memory requirements and add 4 times the number of basis functions squared.

Job Type        Highest Angular Momentum
                f       g       h
SCF Energy      8 mw    8 mw    18 mw
SCF Gradient    8 mw    10 mw   32 mw
SCF Frequency   8 mw    18 mw   54 mw
MP2 Energy      8 mw    10 mw   20 mw
MP2 Gradient    8 mw    12 mw   32 mw
MP2 Frequency   12 mw   20 mw   56 mw
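The estimate above (table value plus 4 times the square of the number of basis functions, in words) can be sketched as a small helper; the function name is hypothetical, and the 10^6-words-per-mw conversion is an assumption, since the page does not define the unit precisely:

```shell
# Hypothetical helper: memory estimate in megawords = base value from the
# table plus 4 * nbasis^2 words (assuming 1 mw = 10^6 words).
estimate_mw () {            # usage: estimate_mw <base_mw> <nbasis>
    awk -v base="$1" -v n="$2" 'BEGIN { printf "%.2f\n", base + 4*n*n/1e6 }'
}

# e.g. an MP2 frequency job with g functions (20 mw) and 400 basis functions:
estimate_mw 20 400          # prints 20.64
```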

For frequency calculations, the freqmem utility gives an estimate of the amount of memory needed to efficiently form the second derivatives. The form of the command is:

freqmem natoms nbasis r|u c|d functions

where natoms is the number of atoms; nbasis the number of basis functions; r or u indicates an RHF or UHF wavefunction; c or d indicates a conventional (disk-based) or direct calculation; and functions is a string that lists all angular momentum types in the basis set, e.g. spdf.
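For instance, a hypothetical 20-atom system with 250 basis functions, an RHF wavefunction, the direct algorithm, and a basis containing s, p, d and f functions would be queried as:

```shell
# Hypothetical example: the numbers are placeholders for your own system.
freqmem 20 250 r d spdf
```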

Multi-processor Jobs

The distributed-memory parallel version of Gaussian for the IBM SP uses the Linda software to coordinate and transfer data.

Before you can run Gaussian 03 in parallel, you must create a file called ".rhosts" in your home directory. The file should contain a single line:

+ username

where username is your username. The file should have no read/write/execute permissions set for group or other.
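The one-time setup can be done as follows ($USER expands to your login name):

```shell
# Create the .rhosts file Linda needs and restrict its permissions.
echo "+ $USER" > "$HOME/.rhosts"
chmod 600 "$HOME/.rhosts"    # no group/other permissions
```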

You should use the "%nproclinda=num" link 0 keyword to request the number of processors. This number should equal the number requested from LoadLeveler. See the Running in batch section above for more details.

For example:

%Chk=c3f8
%mem=12mw
%nproclinda=16
#MP2/6-311g(2d1f) freq

c3f8 mp2 frequency

0 1
...

The following job types can be executed in parallel:

  • SCF and DFT energy, derivatives, and 2nd derivatives
  • MP2 energy, derivatives, 2nd derivatives
  • CI-Singles energy, derivatives, and 2nd derivatives
  • MC-SCF energy, derivatives, and 2nd derivatives
  • MP4, CCSD, CCSD(T) energy, and possibly gradient
  • 1-electron properties
  • Some 2nd order properties, e.g. NMR

Restart

Long-running Gaussian jobs are vulnerable to machine crashes. This section outlines some tips for recovering as much of your intermediate data as possible.

Gaussian Checkpointing

Gaussian can restart the following calculations provided the checkpoint file is saved. See the Files section above on how to do this. If you have used the template script above, you should alter it to return to the temporary directory that contains the Gaussian files:

#@job_name=t1_restart
#@output=$(job_name).o$(jobid)
#@error=$(job_name).e$(jobid)
#@job_type=serial
#@class=regular
#@wall_clock_limit=6:00:00
#@queue

cd $SCRATCH/g03/<directory created during first run>
module load g03
g03 < $HOME/g_tests/T/t1_restart.inx > $HOME/g_tests/T/t1_restart.out
ls -l

Most often, the only input required in the restart input file is the route specification. For example, if the original input is as follows:

%Chk=h2o2
%mem=12mw
#P HF/6-31g OPT

hooh hf optimization

0 1
...

The restart input should be:

%Chk=h2o2
%mem=12mw
#P HF/6-31g OPT=Restart

Exceptions to this are noted in the table below.

Job Type: Action Required

SCF or DFT Single Point Energy
Add the RESTART option to the SCF keyword. All other original input must be provided, e.g. atomic coordinates, basis set. The SCF will restart using the last set of orbitals.
Geometry Optimization
Add the RESTART option to the OPT keyword. The calculation will restart at the last geometry.
G1, G2 and G2MP2
Add the RESTART option to the G1, G2 or G2MP2 keyword. The calculation will restart at the last job step.
CI Singles or Stability
Add the RESTART option to the CIS or STABIL keyword, and to the SCF keyword. All other original input must be provided, e.g. atomic coordinates, basis set. The calculation will restart using the last vectors from the Davidson diagonalization.
Intrinsic Reaction Coordinate
Add the RESTART option to the IRC keyword.
Potential Energy Surface Scans
Add the RESTART option to the SCAN keyword.
Frequency (numerical only)
Add the RESTART option to the FREQ keyword.
Polarizability (numerical only)
Add the RESTART option to the POLAR keyword.

Help

The Gaussian Inc. website has a detailed description of the electronic structure methods available in Gaussian, a set of example calculations and a list of frequently asked questions.

The Gaussian 03 Online Manual is particularly useful.

FAQ

  • The default memory for G03 is larger than the interactive node limit on the IBM SP. This means that you need to add a "%mem=32mb" line if you want to run the program interactively to test it. Production runs should be done in batch.

  • Before you can run Gaussian 03 in parallel, you must create a ".rhosts" file in your home directory; see the Multi-processor Jobs section above for details.

  • The binary files produced by Gaussian 03 are incompatible with files from versions G98 A9 and earlier of the program. This is because G03 uses 64 bit integers throughout the program and in the data files, whereas the older versions use 32 bit integers.

    Checkpoint files can be converted from earlier versions using the following procedure.

    % module load g98/a9
    % formchk test.chk test.fchk
    % module switch g98/a9 g03
    % unfchk test.fchk new.chk
    

    Assuming the checkpoint file is named "test.chk", the G98 A9 formchk program generates a formatted checkpoint file named "test.fchk", and then the G03 version of the unfchk program generates a binary checkpoint file named "new.chk". This new checkpoint file can be used in subsequent runs of the Gaussian program.

  • I see no speedup using Linda parallel Gaussian (g03l) using version C2 on bassi or jacquard.

    For Gaussian 03 C2, the default method of calculating the Coulomb contribution to the energy in an SCF or DFT calculation has changed: it now uses the fast multipole method (FMM). Currently this is not parallelized, but it is considerably faster than the previous method for certain cases. To use the old default, include the keyword Int=NoFoFCou in your input file. Gaussian Inc. discusses this in the release notes for Gaussian 03.
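    For example, a route line restoring the older Coulomb algorithm might look like this (the DFT method and basis set shown are placeholders):

    ```
    #P B3LYP/6-31G* Int=NoFoFCou
    ```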

Contact NERSC consultants with any questions regarding Gaussian 03. If necessary, problems will be forwarded to Gaussian Inc. for analysis.


Page last modified: Sat, 09 Feb 2008 00:46:59 GMT
Page URL: http://www.nersc.gov/nusers/resources/software/apps/chemistry/g03/
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
