Getting Started

Welcome to NERSC

Welcome to the National Energy Research Scientific Computing Center. This document will guide you through the basics of using NERSC's supercomputers, storage systems, and services. NERSC is a High Performance Scientific Computing Center, and this guide assumes that you are familiar with the general concepts of parallel computing.

What is NERSC?

NERSC provides High Performance Computing and Storage facilities and support for research sponsored by, and of interest to, the U.S. Department of Energy Office of Science. NERSC has the unique programmatic role of supporting all six Office of Science program offices: Advanced Scientific Computing Research, Basic Energy Sciences, Biological and Environmental Research, Fusion Energy Sciences, High Energy Physics, and Nuclear Physics. Scientists who have been awarded research funding by any of the offices are eligible to apply for an allocation of NERSC time. Additional awards may be given to non-DOE funded project teams whose research is aligned with the Office of Science's mission. Allocations of time and storage are made by DOE.

NERSC has about 4,000 active user accounts from across the U.S. and internationally.

NERSC is a national center, organizationally part of Lawrence Berkeley National Laboratory in Berkeley, CA. NERSC staff and facilities are located at Berkeley Lab's Oakland Scientific Facility in downtown Oakland, CA.

Computing & Storage Resources

As of May 2012, NERSC's major computing resources are

Hopper
A Cray XE6 with 153,216 compute cores, 217 TB of memory, 2 PB of disk, and the Cray "Gemini" high-speed internal network. Hopper is NERSC's flagship computer for running high-performance parallel scientific codes.
Carver
Carver is an IBM iDataPlex Linux cluster with 3,200 compute cores, 11.5 TB of memory, and an InfiniBand internal network. Carver provides a generic full Linux environment for codes that don't demand massive parallelism or that need operating system features not available on the Cray systems.

Major storage systems are

Global Scratch
The global scratch file system provides each user with a large storage space. Global scratch is temporary storage that is accessible from both Hopper and Carver.
Local Scratch
Hopper also has local scratch file systems; the default user quota on Hopper's local scratch is 5 TB.
Project
The project file system provides TBs of permanent storage, upon request, to groups of users who want to share data. Project is available from all NERSC compute systems.
HPSS Archival Storage
NERSC's archival storage system provides up to 59 PB of permanent, archival data storage.

NERSC also hosts a GPU testbed. To see which of these systems best fits your needs, see Computational Systems and Working with Data.

How to Get Help

With an emphasis on enabling science and providing user-oriented systems and services, NERSC encourages you to ask lots of questions. There are many ways to do just that.

Your primary resources are the NERSC web site and the HPC Consulting and Account Support staff. The consultants can be contacted by phone, email, or the web during business hours (Pacific Time). NERSC's consultants are HPC experts and can answer just about all of your questions.

The NERSC Operations staff is available 24 hours a day, seven days a week, to give you status updates and reset your password. The NERSC web site is always available, with a rich set of documentation, tutorials, and live status information.

Technical questions, account support, passwords, computer operations


1-800-666-3772 (or 1-510-486-8600)
Computer Operations = menu option 1 (24/7)
Account Support = menu option 2, or accounts@nersc.gov
HPC Consulting = menu option 3, or consult@nersc.gov
Online Help Desk = http://help.nersc.gov/

Computer operations (24x7) can reset your password and give you machine status information. Account Support and HPC Consulting are available 8-5 Pacific Time on business days.  See Contacting NERSC.

NERSC Web Site

You're already here so you know where to find NERSC's web site: www.nersc.gov. The web site has a trove of information about the NERSC center and how to use its systems and services. The "For Users" section is designed just for you and it's a good place to start looking around.

NERSC Accounts

In order to use the NERSC facilities you need:

  1. Access to an allocation of computational or storage resources as a member of a project account called a repository.
  2. A user account with an associated user login name (also called username).

If you are not a member of a project that already has a NERSC award, you may apply for an allocation. If you need a new user account associated with an existing NERSC repository, the repository's Principal Investigator or a repository manager can add you to the repository. For all the details, see the accounts pages on the NERSC web site.

PDSF accounts

Users needing to use the PDSF for High Energy Physics and Nuclear Science experiments may fill out the PDSF Account Request Form.

Passwords

Each real person has a single password associated with their login account. This password is known by various names: NERSC password, NIM password, and NERSC LDAP password are all commonly used. As a new user, you must get your initial temporary password by talking to the NERSC Account Support office at 1-800-66-NERSC, menu option 2. You will then need to change that initial password at https://nim.nersc.gov. You should also answer the security questions; this will allow you to reset your password yourself should you forget it. See Passwords.

Login Failures

If you type an incorrect password five times in a row when accessing a NERSC system, your account on that system will be locked. To get it unlocked, call 1-800-66-NERSC (available 24x7) or send email to accounts@nersc.gov during business hours.

Accounting Web Interface (NIM)

You log into the NERSC Information Management (NIM) web site at https://nim.nersc.gov/ to manage your NERSC accounts. In NIM you can check your daily allocation balances, change your password, run reports, update your contact information, change your login shell, etc.  See NIM Web Portal.

Connecting to NERSC

In order to log in to NERSC computational systems, you must use the SSH protocol. This is provided by the "ssh" command on Unix-like systems (including Mac OS X) or by an SSH-compatible application (e.g., PuTTY on Microsoft Windows). Log in with your NERSC username and password. You can also use tools based on certificate authentication (e.g., GridFTP); please ask the NERSC consultants for details.

We recommend that you "forward" X11 connections when initiating an SSH session to NERSC.  For example, when using the ssh command on Unix-based systems, provide the "-Y" option.

In the following example, a user logs in to Hopper, with NERSC username "elvis", and requests X11 forwarding:

myhost% ssh -Y elvis@hopper.nersc.gov
Password: enter NIM password for user elvis
Last login: Tue May 15 11:32:10 2012 from 128.55.16.121

---------------------------- Contact Information ------------------------------
NERSC Contacts                http://www.nersc.gov/about/contact-us/
NERSC Status                  http://www.nersc.gov/users/live-status/
NERSC: 800-66-NERSC (USA)     510-486-8600 (outside continental USA)

------------------- Systems Status as of 2012-05-15 12:39 PDT ------------------
Carver:      System available.
Dirac:       System available.
Euclid:      System available.
Genepool:    System available.
Hopper:      System available.
HPSS Backup: System available.
HPSS User:   System available.
NGF:         System available.
PDSF:        System available.

------------------- Service Status as of 2012-05-15 12:39 PDT ------------------
All services available.

-------------------------------- Planned Outages -------------------------------
HPSS User:   05/16/12 09:00-13:00PT, scheduled maintenance.

License Servers:  05/16/12 10:30-12:30PT, scheduled maintenance.
                    Hopper compilers and debuggers will be unavailable
                    during this time period.

--------------------------------- Past Outages ---------------------------------
Euclid:      05/14/12 21:10-21:31PT, unscheduled maintenance.
--------------------------------------------------------------------------------
hopper12 e/elvis>

Software

NERSC and each system's vendor supply a rich set of HPC utilities, applications, and programming libraries. If something you want is missing, send email to consult@nersc.gov with your request and NERSC will evaluate it for appropriateness, cost, effort, and benefit to the community.

For a list of available software, see NERSC Software. Popular applications include VASP and Gaussian; libraries include PETSc and HDF5.

More information about how you use software is included in the next section.

Computing Environment

When you log in to any NERSC computer (not HPSS), you are in your global $HOME directory. You initially land in the same place no matter which machine you connect to; Hopper, Carver, and Euclid all share the same home directories. This means that if you have files or binary executables that are specific to a certain system, you need to manage their locations. Many people make subdirectories for each system in their home directory. Here is a listing of my home directory.

hopper12% ls
bassi/   datatran/  hopper/    silence/   turing/
bin/     davinci/   jacquard/  software@  web@
carver/  franklin/  project@   tesla/     www@
common/  grace/     rohan/     training@  zwicky/

Customizing Your Environment

The way you interact with the NERSC computers can be controlled via certain startup scripts that run when you log in and at other times.  You can customize some of these scripts, which are called "dot files," by setting environment variables and aliases in them. 

There are several "standard" dot-files that are symbolic links to read-only files that NERSC controls. Thus, you should NEVER modify or try to modify such files as .bash_profile, .bashrc, .cshrc, .kshrc, .login, .profile, .tcshrc, or .zprofile.  Instead, you should put your customizations into files that have a ".ext" suffix, such as .bashrc.ext, .cshrc.ext, .kshrc.ext, .login.ext, .profile.ext, .tcshrc.ext, .zprofile.ext, and .zshrc.ext.   Which of those you modify depends on your choice of shell, although note that NERSC recommends the csh. 

The table below contains examples of basic customizations.  Note that when making changes such as these it's always a good idea to have two terminal sessions active on the machine so that you can back out changes if needed!

Customizing Your Dot Files
bash                            csh
export ENVAR=value              setenv ENVAR value
export PATH=$PATH:/new/path     set path = ( $path /new/path )
alias ll='ls -lrt'              alias ll "ls -lrt"

Note, too, that you may want certain customizations for just one NERSC platform and not others, but your "dot" files are the same on all NERSC platforms and are executed upon login for all.  The solution to this problem is to test the value of a preset environment variable $NERSC_HOST, as follows:

if ($NERSC_HOST == "euclid") then
    setenv FC ifort
endif
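
If your login shell is bash, the same test goes in your .bashrc.ext instead. Here is a minimal sketch, using the same illustrative compiler setting as above; substitute whatever customization you actually need:

if [ "$NERSC_HOST" = "euclid" ]; then
    export FC=ifort        # example customization only
fi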

If you accidentally delete the symbolic links to the standard dot-files, or otherwise damage your dot-files to the point that it becomes difficult to do anything, you can recover the original dot-file configuration by running the NERSC command fixdots.

Modules

Easy access to software is controlled by the modules utility. With modules, you can easily manipulate your computing environment to use applications and programming libraries. In many cases, you can ignore modules because NERSC has already loaded a rich set of modules for you when you first log in. If you want to change that environment, you "load," "unload," and "swap" modules. A small set of module commands can do most of what you'll want to do.

module list

The first command of interest is "module list", which will show you your currently loaded modules. When you first log in, you have a number of modules loaded for you. Here is an example from Hopper.

hopper12% module list

Currently Loaded Modulefiles:
 1) modules/3.2.6.6                      9) gni-headers/2.1-1.0400.4156.6.1.gem  17) xt-shmem/5.4.4
 2) xtpe-network-gemini                 10) xpmem/0.1-2.0400.30792.5.6.gem       18) xt-mpich2/5.4.4
 3) pgi/12.2.0                          11) xe-sysroot/4.0.36                    19) torque/2.5.9
 4) xt-libsci/11.0.06                   12) xt-asyncpe/5.08                      20) moab/6.1.5
 5) udreg/2.3.1-1.0400.3911.5.13.gem    13) atp/1.4.2
 6) ugni/2.3-1.0400.4127.5.20.gem       14) PrgEnv-pgi/4.0.36
 7) pmi/3.0.0-1.0000.8661.28.2807.gem   15) eswrap/1.0.10
 8) dmapp/3.2.1-1.0400.3965.10.63.gem   16) xtpe-mc12

You don't have to be concerned with most of these most of the time. The most important one to you is called "PrgEnv-pgi", which lets you know that the environment is set up to use the Portland Group compiler suite.

module avail

Let's say you want to use a different compiler. The "module avail" command will list all the available modules. It's a very long list, so I won't reproduce it here. But you can use a module's name stem to do a useful search. For example:

hopper12% module avail PrgEnv

--------------------------- /opt/modulefiles -------------------------------

PrgEnv-cray/3.1.61               PrgEnv-intel/4.0.36(default)
PrgEnv-cray/4.0.30               PrgEnv-intel/4.0.46
PrgEnv-cray/4.0.36(default)      PrgEnv-pathscale/3.1.61
PrgEnv-cray/4.0.46               PrgEnv-pathscale/4.0.30
PrgEnv-gnu/3.1.61                PrgEnv-pathscale/4.0.36(default)
PrgEnv-gnu/4.0.30                PrgEnv-pathscale/4.0.46
PrgEnv-gnu/4.0.36(default)       PrgEnv-pgi/3.1.61
PrgEnv-gnu/4.0.46                PrgEnv-pgi/4.0.30
PrgEnv-intel/3.1.61              PrgEnv-pgi/4.0.36(default)
PrgEnv-intel/4.0.30              PrgEnv-pgi/4.0.46

Here you see that five programming environments are available, using the Cray, GNU, Intel, PathScale, and PGI compilers. (The word "default" is confusing here; it does not refer to the default computing environment, but rather to the default version of each specific computing environment.)

module swap

Let's say I want to use the Cray compiler instead of PGI. Here's how to make the change:

hopper12% module swap PrgEnv-pgi PrgEnv-cray
hopper12%

Now you are using the Cray compiler suite. That's all you have to do. You don't have to change your makefiles or anything else in your build script unless they contain PGI- or Cray-specific options or features. Note that modules doesn't give you any feedback about whether the swap command did what you wanted it to do, so always double-check your environment using the "module list" command.
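
For example, one quick check is to filter the list for the programming environment (the modules utility typically writes its output to stderr, hence the redirection; this is just one way to verify the swap):

hopper12% module list 2>&1 | grep PrgEnv

After the swap above, you should see a PrgEnv-cray entry rather than PrgEnv-pgi.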

module load

There is plenty of software that is not loaded by default. You can consult the NERSC web pages to see a list, or you can use the "module avail" command to see what modules are available ("module avail" output can be a bit cryptic, so check the web site if you are in doubt about a name).

For example, suppose you want to use the NAMD molecular dynamics application. Try "module avail namd":

nid00163% module avail namd

------------------ /usr/common/usg/Modules/modulefiles -----------------------------
namd/2.7 namd/2.8(default) namd/2.8b1 namd/cvs namd_ccm/2.8(default)

The default version is 2.8, but say you'd rather use some features available only in version 2.8b1. In that case, just load that module.

hopper12% module load namd/2.8b1
hopper12%

Now you can invoke NAMD with the "namd2" command (that's the name of the NAMD binary) in the proper way (see Running Jobs below).

If you want to use the default version, you can type either "module load namd" or "module load namd/2.8"; either will work. (The word "default" is not part of the module name.)

Compiling Code

Let's assume that we're compiling code that will run as a parallel application using MPI for internode communication, and that the code is written in Fortran, C, or C++. In this case it's easy, because you will use standard compiler wrapper scripts that bring in all the include-file and library paths and set the linker options you'll need.

On the Cray systems you should use the following wrappers:  ftn, cc, or CC

On the Carver system you should use the following wrappers: mpif90, mpicc, or mpiCC

for Fortran, C, and C++, respectively. 

Parallel Compilers
Platform    Fortran    C        C++
Cray        ftn        cc       CC
Others      mpif90     mpicc    mpiCC

Here's a "Hello World" program to illustrate.

hopper12% cat hello.f90 
program hello

        implicit none

        include "mpif.h"

        integer:: myRank
        integer:: ierror

        call mpi_init(ierror)

        call mpi_comm_rank(MPI_COMM_WORLD,myRank,ierror)

        print *, "MPI Rank ",myRank," checking in!"

        call mpi_finalize(ierror)

end program hello

To compile on Hopper (a Cray), use

hopper12% ftn -o hello.x hello.f90 
hopper12%

That's all there is to it. No need to put things like -I/path/to/mpi/include/files or -L/path/to/mpi/libraries on the compile line; the "ftn" wrapper does it all for you. (For fun, add a -v flag to the compile line to see all the things you'd have to specify by hand if the wrappers weren't there to help. You don't want to do that!) In addition, when system software is updated, you don't have to change your compile line to point to new directories.

Using Programming Libraries

Cray

If you want to use a programming library, all you have to do on the Crays is load the appropriate module. Let's compile an example code that uses the HDF5 I/O library. (The code is HDF5 Example.) First let's try it in the default environment.

hopper12% cc -o hd_copy.x hd_copy.c
PGC-F-0206-Can't find include file hdf5.h (hd_copy.c: 39)

The example code includes the line

#include "hdf5.h"

and the compiler doesn't know where to find it. Now let's load the hdf5 module and try again.

hopper12% module load hdf5
hopper12% cc -o hd_copy.x hd_copy.c
hopper12%

We're all done and ready to run the program! No need to manually add the path to HDF5; it's all taken care of by the scripts.

Non-Cray

On NERSC machines that are not Crays, you have to do a little more work to use libraries. But not too much. Let's try the same thing we did above for Cray.

cvrsvc04% mpicc -o hd_copy.x hd_copy.c
PGC-F-0206-Can't find include file hdf5.h (hd_copy.c: 39)
PGC/x86-64 Linux 10.8-0: compilation aborted
cvrsvc04% module load hdf5
cvrsvc04% mpicc -o hd_copy.x hd_copy.c
PGC-F-0206-Can't find include file hdf5.h (hd_copy.c: 39)
PGC/x86-64 Linux 10.8-0: compilation aborted

Even with the module loaded, the compiler doesn't know where to find the HDF5 files. The mpicc wrapper script doesn't contain quite as many features as the Cray cc wrapper, so you need to fix this yourself. One option is to check the NERSC web site for instructions. Another is to figure it out yourself by looking under the covers of the HDF5 module.

cvrsvc04% module show hdf5
-------------------------------------------------------------------
/usr/common/usg/Modules/modulefiles/hdf5/1.8.3:

conflict         hdf5-parallel
module           load szip
module           load zlib
setenv           HDF5_DIR /usr/common/usg/hdf5/1.8.3/serial
setenv           HDF5 -L/usr/common/usg/hdf5/1.8.3/serial/lib ...
setenv           HDF5_INCLUDE -I/usr/common/usg/hdf5/1.8.3/serial/include  
prepend-path     PATH /usr/common/usg/hdf5/1.8.3/serial/bin
prepend-path     LD_LIBRARY_PATH /usr/common/usg/hdf5/1.8.3/serial/lib
-------------------------------------------------------------------

The "module show" command reveals (most of) what the module actually does when you load it. You can see that it defines some environment variables, for example HDF5_INCLUDE, which you can use in your build script or Makefile. Look at the definition of the HDF5 environment variable: it contains all the include and link options in one variable. Let's try using it.

cvrsvc04% mpicc -o hd_copy.x hd_copy.c $HDF5
cvrsvc04%

That worked, so we're done compiling.
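
If you compile from a build script or Makefile, you can refer to these variables instead of hard-coding paths. Here is a minimal sketch of a separate compile and link step, assuming the hdf5 module has already been loaded in your session and that, as the (elided) "module show" output above suggests, $HDF5 carries the link options:

cvrsvc04% mpicc -c hd_copy.c $HDF5_INCLUDE       # compile step: $HDF5_INCLUDE supplies the -I flags
cvrsvc04% mpicc -o hd_copy.x hd_copy.o $HDF5     # link step: $HDF5 supplies the -L and library flags

Because the variables come from the module, your build commands keep working when NERSC installs a new HDF5 version and updates the module behind the scenes.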

Running Jobs

High performance parallel computing codes generally run in "batch" mode at NERSC. Batch jobs are controlled by scripts written by the user and submitted to a batch system that manages the compute resource and schedules the job to run based on a set of policies. In general, NERSC batch systems work on a first-in, first-out basis, but this is subject to a number of constraints.

Batch Jobs

Batch scripts consist of two parts: 1) a set of directives that describe your resource requirements (time, number of processors, etc.) and 2) UNIX commands that perform your computations. These UNIX commands may create directories, transfer files, etc.; anything you can type at a UNIX shell prompt.

The actual execution of your parallel job, however, is handled by a special command called a job launcher. In a generic Linux environment this utility is often called "mpirun"; on Cray systems it is named "aprun." For details, see the Running Jobs pages on the NERSC web site.
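
As an illustration, here is a minimal sketch of a Hopper batch script for the hello.x program compiled in the previous section. The queue name, core count, and wall-clock limit below are placeholders only; check the NERSC queue documentation for current values.

#PBS -N hello
#PBS -q debug
#PBS -l mppwidth=48
#PBS -l walltime=00:10:00
#PBS -j oe

cd $PBS_O_WORKDIR
aprun -n 48 ./hello.x

Save this in a file (say, hello.pbs), submit it with "qsub hello.pbs", and check its progress with "qstat".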

Interactive Parallel Jobs

Development, debugging, and testing of parallel code demand interactivity. At NERSC you can run parallel jobs interactively, subject to a 30-minute time limit and a limit on the number of nodes. Instead of submitting a batch job script, you tell the batch system that you want interactive access and then run your commands at the command prompt.

On Hopper
% qsub -V -I -lmppwidth=<number of cores> 
% cd $PBS_O_WORKDIR
% aprun -n <number of tasks> <name_of_executable_binary>
On Carver
% qsub -V -I -lnodes=<number of nodes>
% cd $PBS_O_WORKDIR
% mpirun -n <number_of_tasks> <name_of_executable_binary>

Please note that a parallel application will fail to run unless you have first set up the parallel environment with the qsub command.

Interactive Serial Jobs

Sometimes you need to run serial programs and utilities, e.g., to perform a brief analysis or transfer files. The nodes that you log in to present you with a more-or-less standard Linux environment. You can perform the functions you would typically do in such an environment, but remember that you are sharing this resource with many users; if your CPU usage is excessive, NERSC will ask you to move your computations elsewhere, or kill your processes if necessary. NERSC provides a platform named Euclid for running serial applications, as well as dedicated data transfer nodes.

See the NERSC web site for more information on these options.

Transferring Data

We provide several ways to transfer data both inside and outside NERSC. To transfer files to or from NERSC, we suggest using the dedicated Data Transfer Nodes, which are optimized for bandwidth and have access to most of the NERSC file systems.

Tools for data transfer include:

  • SCP/SFTP: for smaller files (<1 GB); see the example after this list.
  • Globus Online: for large files, with auto-tuning and automatic fault recovery, and no client install required
  • BaBar Copy (bbcp): for large files
  • GridFTP: for large files
  • HSI: can be an efficient way to transfer files already in the HPSS system
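
As a quick illustration of the first option, a small file can be pushed to your NERSC home directory with scp through a data transfer node (the host name below is an illustrative data transfer node name; substitute your own username and destination):

myhost% scp results.dat elvis@dtn01.nersc.gov:~/

For larger data sets, prefer one of the parallel tools listed above.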

For more detailed information on data transfer, see Transferring Data.