MPP2 Details

Contents: Configuration - Access - File Systems - Environment - Compilers - Modules - MPI - Job Submission - NWChem jobs - Sample Script - Interactive jobs - Time Allocation Accounts - Job Policies - FAQ

Configuration

MPP2 is an 11.8 TFlops system consisting of 980 Hewlett-Packard Longs Peak nodes (of which 944 are used for batch processing), each with dual Intel 1.5 GHz Itanium-2 (Madison) processors and HP's zx1 chipset. The Madison processors are 64-bit processors with a theoretical peak of 6 GFlops. There are two types of nodes on the system: FatNodes (10 Gbytes of memory, i.e. 5 Gbytes per processor, and 430 Gbytes of local disk space) and ThinNodes (10 Gbytes of memory, i.e. 5 Gbytes per processor, and 10 Gbytes of local disk space). Fast interprocessor communication is provided by a single-rail QsNetII/Elan4 interconnect from Quadrics. The system runs a version of Linux based on Red Hat Linux Advanced Server. A global 53 Tbyte Lustre file system is available to all the processors. Processor allocation is scheduled using the LSF resource manager.

Access [top]

Accessing MPP2 with SecurID®

For security reasons access to the Molecular Science Computing Facility is obtained through one-time passcodes using SecurID® cards.

Remote access to MPP2 is a two-step process: users first log onto mpp2e.emsl.pnl.gov (an IA-32 node), and from there log onto MPP2 using their normal Username/Kerberos password combination. The step-by-step procedure is presented below. You must have a SecurID® card from MSCF/PNNL and have completed the initialization procedure before you try to log onto mpp2e.

More information on SecurID®

From Linux or Unix systems

Note: our machines use SSH protocol 2; you may need to use ssh2 or ssh -2 for the connection to work.

  1. type the following at the window prompt: ssh <Username>@mpp2e.emsl.pnl.gov
  2. when prompted for the passcode, enter your PIN and SecurID® number
  3. once logged in on mpp2e, enter the command: ssh mpp2
  4. when prompted for your password, enter your Kerberos password for MPP2
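
Put together, a typical session looks like the sketch below; the comments simply restate the steps above:

          # on your local Linux/Unix machine; force protocol 2 if needed
          ssh -2 <Username>@mpp2e.emsl.pnl.gov
          # at the passcode prompt, enter your PIN followed by the SecurID® number

          # once logged in on mpp2e, hop to MPP2
          ssh mpp2
          # at the password prompt, enter your Kerberos password for MPP2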

From PC or Mac systems

You will need to use at least version 5.3 build 23 of the SSH software from F-Secure (the current version is 5.4 build 54). When connecting to MSCF's machines, the Authentication method must be set to "Keyboard Interactive". Alternatively, you can use PuTTY on Win32 platforms.

  1. Start F-Secure or PuTTY
  2. Set Host name to mpp2e.emsl.pnl.gov
  3. Set User Name to your Username on MPP2
  4. Set Authentication method to "Keyboard Interactive" (the default for PuTTY)
  5. Click on 'Connect' {F-Secure} or 'Open' {PuTTY}
  6. when prompted for the passcode, enter your PIN and SecurID® number
  7. once logged in on mpp2e, enter the command: ssh mpp2
  8. when prompted for your password, enter your Kerberos password for MPP2

File Systems [top]

There are four file systems available on the cluster:

Environment [top]

Software development and application runs require a matched set of compilers, communication libraries, math libraries, and tools; these components are not interchangeable with other pieces of the software development suite and are updated regularly. To make the software more supportable and the environment setup more automatic, and thereby easier to use for the user community, we have adopted "modules" as a way to present packages of software that work together and to describe the required dependencies among software packages.

Environment setup through modules [top]

Loading a module environment provides the user with the correct paths to commands, compilers, and libraries, and sets up the necessary environment variables. The default module environment, which uses 8-byte integers (i.e. -i8), is loaded at login time.

Various commands are available to probe your environment and to switch, add, or change (pieces of) the user environment:
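
As a sketch, the standard module commands look like the following; the module names available on MPP2 are not listed here, so check "module avail" first:

          module avail                 # list the module environments installed on the system
          module list                  # show the modules currently loaded in your environment
          module load <module name>    # add a module to your environment
          module swap <old> <new>      # replace one loaded module with another
          module unload <module name>  # remove a module from your environment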

Notes on using modules

Compilers [top]

The primary compilers are Intel's ifort (Fortran) and icc (C); version 8.1 is currently installed on the system. The following compiler options and libraries will enhance the performance of the codes you compile:

Additional options can be found in the man pages, by typing "ifort -help", or in the Intel Online Compiler Documentation. One additional setting that might be important for users who transfer binary files between systems (to SGIs, for example) is the environment variable that forces the code to read and write big-endian binary files:

setenv F_UFMTENDIAN big
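
The recommended option list is not reproduced here, so the following compile line is only a sketch; the -i8 flag matches the default integer*8 module environment mentioned earlier, while the optimization level is an assumption to check against the compiler documentation:

          # hedged example: 8-byte integers to match the default module environment, plus optimization
          ifort -i8 -O3 -o myprog.x myprog.f90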

Intel's idb parallel debugger is available on the system. The GNU gdb debugger can be used to debug individual processes of a parallel program on each processor. In addition, the TotalView debugger (found at /home/scicons/apps/totalview.6.3.1-0) and the Vampir performance analyzer (found at /home/scicons/apps/vampir) are available for debugging and performance tuning.

MPI [top]

The primary communication protocol for running parallel jobs is MPI. The MPI libraries, based on MPICH, have been implemented by Quadrics on top of the Elan3 (or Elan4) interconnect. There are a number of ways you can compile your parallel codes:

The environment variables MPI_INCLUDE and MPI_LIB are set by the module environment to point to the appropriate locations.
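
As an illustration only, a compile line using these variables might look like the sketch below; the library names to link against (assumed here to be the Quadrics MPI and Elan libraries) depend on the module environment you have loaded:

          # hedged sketch: compile and link an MPI Fortran code against the Quadrics MPI
          ifort -i8 -O3 -I$MPI_INCLUDE -o mympi.x mympi.f90 -L$MPI_LIB -lmpi -lelan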

Job Submission [top]

Platform's LSF is the batch scheduler and resource manager used to submit and run jobs. Its commands are very similar to those of NQS and PBS; for example, bsub, bjobs, and bkill work similarly to the PBS commands qsub, qstat, and qdel. These three commands, along with showq, rinfo, and window, are probably the only batch commands you will ever use on MPP2. The format of the job submission script is discussed in the next section.

To submit a LSF jobfile:

To view the LSF queue:

An alternative, easier-to-read view of the LSF queue:

To remove a jobfile from the queue:

An overview of the processor status can be obtained from:

Check how many processors are available:
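
Typical invocations of the commands named above are sketched here; treat them as a guide and consult the man pages for the exact options on MPP2:

          bsub < myjob.lsf        # submit an LSF jobfile (the #BSUB options are read from the file)
          bjobs                   # view your jobs in the LSF queue
          showq                   # alternative, easier-to-read view of the queue
          bkill <jobID>           # remove a job from the queue
          rinfo                   # overview of the processor status and availability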

Submitting NWChem batch jobs [top]

When running NWChem calculations, users are encouraged to submit their jobs through the llnw script (available at /home/scicons/bin/llnw). This script sets up the job script and running environment, and makes sure the appropriate files are copied from and to your working directory.

Sample Script for Batch Jobs [top]

Here is a csh example of an LSF jobfile for submitting a batch parallel job. Replace the placeholder items (shown in angle brackets) with your own account and job information.

          #!/bin/csh
          #BSUB -P <account>
          #BSUB -n <number of processors>
          #BSUB -m <type of processors>
          #BSUB -W 4:00
          #BSUB -J <jobname>
          #BSUB -i <input file>
          #BSUB -e sample.err.%J
          #BSUB -o sample.out.%J
          #BSUB -u your_email@pnl.gov
          #BSUB -N

          #############################################################################
          # Copy files to /scratch (if necessary)
          #############################################################################

          foreach host ($LSB_HOSTS)
            rcp <your file> ${host}:/scratch/<your file>
          end

          #############################################################################
          # Run code (or multiple codes by repeating the prun command)
          #############################################################################

          prun -n <number of processors> your_program.x

          #############################################################################
          # Copy back important files to the working directory
          #############################################################################

          foreach host ($LSB_HOSTS)
            rcp ${host}:/scratch/<file to be copied> <your working dir>/<file to be copied>
          end

The bsub options in the script above will be discussed briefly:

There are many more options that can be specified; for those, please read the man pages of the bsub command.

The prun command in the job script specifies the parallel run. The options are:
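
A hedged sketch of typical prun usage is given below; the -n option appears in the sample script above, while the node-count option is an assumption you should verify against "man prun":

          prun -n 16 your_program.x        # run the code on 16 processors
          prun -N 8 -n 16 your_program.x   # assumed: 16 processes spread over 8 nodes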

Alternative methods can be used to copy files to the scratch disks. For example, if the same file needs to be copied to all nodes, one could use the following line:

          pdsh -f 30 -w `nodes c $LSB_HOSTS` cp <your file> /scratch/<your file>

Running Interactive Jobs [top]

MPP2 has a pseudo-interactive queue of 32 processors available for software testing and debugging purposes. The per-job limit in this queue is 8 processors and 30 minutes. To start an interactive job, use the following command:

Except for -Is, the options are described above. Specifying -Is requests that an interactive job be started; its argument is the type of interactive shell you would like to have opened (i.e. csh or bash). After obtaining the processors you can start a parallel job using the prun command.
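
A hedged sketch of starting an interactive session is shown below; the option names follow the description above, but the account name and processor count are placeholders:

          bsub -Is -P <account> -n 8 -W 0:30 csh    # request 8 processors for 30 minutes, opening a csh shell
          prun -n 8 your_program.x                  # once the shell starts, run the parallel code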

Note: this queue is a pseudo-interactive queue. Nodes are obtained from LSF in the same way they are for batch jobs, which means there could be a delay before your interactive job actually starts (because processors are not available or multiple people are waiting for interactive nodes). In general, interactive processors should be available within 30 minutes.

Time Allocation Accounts [top]

The Time Allocation Account needed to submit both batch and interactive jobs is best thought of as a bank account holding the CPU hours that have been allotted to the project you are involved in. The name of the account can be obtained from your project PI, or by typing gbalance -h. This command shows the account name and the number of hours available on the account to you and the other users on it. If no accounts are shown, please contact the MSCF Consulting team. Some users are involved in multiple projects and have multiple account names to choose from; please make sure you use the appropriate account for the job you are planning to submit. If you are not sure which account to use, please contact your PI.
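
For example, to check your allocation balance with the GOLD accounting command mentioned later in this document:

          gbalance -h -u <UserID>    # show the account name(s) and the hours available to you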

Job Policies [top]

The primary objective of the MSCF is to provide teraflop computing resources for grand-challenge computational problems. The job scheduling policy has been established to give higher priority to effective throughput and turnaround of large jobs. To maximize system flexibility, all jobs are submitted to a single queue. The job scheduler controls the allocation of compute processors to the user's job and will place the job in one of the four available queues: short, normal, large, and idle. This allocation is governed by a number of policy constraints, all of which must be satisfied. For more information on MSCF policies, please see User Policies.

Job Policy Constraints:

There is also a set of default limits on the run time a single job with a given number of processors can have, shown below.

Number of processors in a single job, with the corresponding time limit:

  512 - 1800 processors: 48 wall clock hours. These jobs will be placed ahead of the jobs in the queues below, i.e. they will receive the highest priority.
  256 - 512 processors: 48 wall clock hours. These jobs will be placed ahead of the jobs in the queues below, i.e. they will receive higher priority.
  33 - 255 processors: 48 wall clock hours. Normal priority jobs. Note that many of these jobs will backfill with the large jobs in the larger queues.
  8 - 32 processors: 36 wall clock hours. Normal priority jobs. Note that many of these jobs will backfill with the large jobs in the larger queues.
  1 - 8 processors: 30 minutes. Test / Interactive queue; the 32 processors in this queue are reserved on ThinNodes only.

Idle queue:

The idle queue provides the opportunity for projects that have run out of their regular allocation to use processors that are sitting idle on the machine. The primary purpose of this queue is to increase machine usage and to help projects that have exhausted their original allocation get some computations done. The only limit on the "idle queue" is that jobs must run for 90 minutes or less. If your job needs to create a restart file, be sure it gets written before the 90-minute window ends.

Time used in the "idle queue" is tracked in the GOLD accounting system and is designated as a "Charge Limit" for the project. Projects that qualify for the "idle queue" will have time assigned in the "CreditLimit" column of the "gbalance -h -u <UserID>" command.

To submit a job to the idle queue, include the command #BSUB -q "idle" in your jobfile. To see jobs in the "idle queue" you will need to use the -x flag for showq; for example, "showq -x | grep <UserID>" will find all of your jobs in all queues on MPP2. Send an e-mail to mscf-consulting@emsl.pnl.gov for more information or with any questions.
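
In summary, the two idle-queue commands given above are:

          #BSUB -q "idle"                 # in your jobfile: submit the job to the idle queue
          showq -x | grep <UserID>        # list your jobs in all queues, including the idle queue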

SIGHTS special purpose queue:

In addition to the queue limitations mentioned above, users can request access to a special-purpose queue called Scientific Impact Generated by High Teraflop Simulations (SIGHTS). The SIGHTS queue is for compute jobs that require resources beyond the normal queue limits on MPP2 and that serve uniquely impactful, cutting-edge PNNL/EMSL mission science opportunities which cannot be performed at any other computing facility. SIGHTS jobs should require 1024 processors or more, up to the capacity of MPP2. SIGHTS jobs are not automatically placed in the MPP2 queue; they can be submitted any time after approval and will be tended by an MSCF scientific consultant and operations personnel to assist in successful job completion. All requests for a week with a monthly outage must be submitted by 12 noon on the Thursday before the scheduled outage.

Access to the SIGHTS queue is by request only and is subject to time availability. All requests are submitted to the MSCF consulting group for review; please use the keyword "SIGHTS". In your request, provide a short (one- to two-page) description of what you plan to do and how you plan to do it. Upon receipt of the request, a consultant will be assigned to the job and will work with the users to be sure the job is ready. The consultants and operations staff will watch all SIGHTS jobs to be sure they are running correctly. Details about SIGHTS jobs are:

Short pool:

The short pool of 16 reserved ThinNode processors allows users to run small, short jobs to test or debug their codes. Interactive or test jobs are limited to a maximum of 8 processors and a 30-minute time limit per job. Note: the reserved processors in this pool are ThinNodes. Hence, if you request FatNodes in your job, you will have to wait for FatNode processors to become available.

These constraints are used as system default values. If you require resources beyond these limits (more processors, longer run times), please have your Principal Investigator contact the MSCF Computer Projects Manager, and the appropriate user account can be configured with exceptions to override the default values.

FAQ [top]

Below is a list of frequently asked questions. This list will grow as more user questions arise.


We invite you to log in, exercise the system and report any problems/issues that you have with the machine.

For questions or problems with application software, hardware, and/or system software, please contact the MSCF consulting group through the web mscf-consulting form (internal users only) or send email to mscf-consulting@emsl.pnl.gov.