Biowulf at the NIH
Programming Tools and Libraries

Biowulf is intended to run code written by our users as well as commercial and open-source codes, which may need to be built for our platform(s) if they do not come in a usable binary format. Accordingly, we host a number of compilers and build environments to suit the needs of developers and individuals who need to build projects from source.

This page provides information specific to the Biowulf development environment as well as a rough overview of the compilers, libraries and programs available on our system. For a working understanding of any specific package or program, you will usually need to consult its linked documentation.

Notes

64-bit vs 32-bit

The Biowulf head-node is a 32-bit system that can only be used to compile 32-bit applications. In most cases this is the preferred mode: 32-bit applications will run on any node in the cluster, and applications with sufficiently modest memory and integer/floating-point requirements often perform better in 32-bit mode. However, some users may require the larger per-process memory possible on a 64-bit architecture (>4GB), or may benefit from the additional registers and instruction extensions present on 64-bit Intel/AMD processors; only the developer, who is familiar with the project, can determine this.

A large majority of the cluster nodes are 64-bit systems; however, some use 32-bit Intel Xeon processors. Users who build a 64-bit application will need to specify (at least) the "x86-64" node property in their qsub options to keep the job from dying on a 32-bit node.
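
For example, a 64-bit job might be submitted with that property appended to the nodes specification (a sketch; "myjobscript" is a placeholder for your batch script):

% qsub -l nodes=4:x86-64 myjobscript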

If a 64-bit application is required or beneficial, the user will need to gain a session on one of the 64-bit interactive nodes to compile and test. This is done by logging into Biowulf and requesting an interactive job from the scheduler:

% qsub -I -l nodes=1

The above example will give the user an interactive session on a system with a 64-bit development toolchain.


Firebolt

In addition to the standard cluster nodes, we maintain a 32-processor Itanium2 system (Firebolt) with 96GB of RAM intended for large memory or floating-point intensive tasks. This system has the usual gcc compiler suite available as well as the Intel compiler suite. Complete information on Firebolt's hardware and software configuration is available here.


Compiler Suites

All Biowulf cluster nodes include the GCC compiler suite, which provides C, C++, FORTRAN77 and FORTRAN90/95 compilers (gcc, g++, g77 and gfortran respectively) along with the GNU debugger (gdb). In addition to these default compilers, three other popular suites are available to Helix/Biowulf users and may improve the performance of your project or better accommodate certain code bases - the Intel, Pathscale and Portland Group International (PGI) compiler suites.

Each compiler suite below has a quick-chart that shows the location of a set-up script to enable the compiler in your environment, lists the common front-ends, and shows the locations of the various MPI installations by target architecture and interconnect (see the MPI section below for details on the MPI installations). For instance, if you wanted to use the Intel compilers to build your project and your current shell is bash, the following command would set up your environment:

% source /usr/local/intel/intelvars.sh
    Arch is i386.
    setting up for Intel C compiler version 10.1.018.
    setting up for Intel Fortran compiler version 10.1.018.
    setting up for Intel debugger verion 10.1.018.

GNU Compilers

The venerable GNU compiler suite is available in the user's PATH by default. Though not considered "high-performance" or "optimized," these compilers are usually the best choice for pre-existing source code, since build systems are often written with GCC in mind: sensible compiler flags are generated and the build is comparatively trouble-free. However, if performance is an issue, consult the documentation distributed with the source you are trying to build to see whether other compilers are supported. If you are developing a high-performance application "in-house," you may want to explore the other compilers available on Biowulf.

GCC quick-chart
Current Version: 4.1.2
Documentation: Try "man gcc", "man g++", "man g77" or "man gfortran".
Primary front-ends:
C gcc
C++ g++
Fortran77 g77
Fortran90/95 gfortran
MPI Installations:
Ethernet (32-bit): /usr/local/mpich2
Ethernet (64-bit): /usr/local/mpich2-gnu64
Myrinet (32-bit): /usr/local/mpich-gm2k
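
As a quick illustration, a serial program can be built directly with the default GNU compilers (the source file names below are placeholders):

% gcc -O2 -o myprog myprog.c
% gfortran -O2 -o myprog myprog.f90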

Portland Group International (PGI Compilers)

The Portland Group suite includes the usual set of C, C++, FORTRAN77 and FORTRAN90/95 compilers. Also included are an OpenMP implementation, preliminary support for FORTRAN2000, and PGDBG, a graphical debugger (see the debugging section below).

PGI quick-chart
Current Version: 8.0-1
Setup Scripts
bash /usr/local/pgi/pgivars.sh
csh/tcsh /usr/local/pgi/pgivars.csh
Documentation: PGI Compiler Documentation
Primary front-ends:
C pgcc
C++ pgCC
Fortran77 pgf77
Fortran90 pgf90
Fortran95 pgf95
MPI Installations:
Ethernet (32-bit): /usr/local/mpich2-pgi
Ethernet (64-bit): /usr/local/mpich2-pgi64
Myrinet (32-bit): /usr/local/mpich-gm2k-pg
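
A minimal sketch of a PGI build (the source file name is a placeholder; -mp enables the OpenMP support mentioned above):

% source /usr/local/pgi/pgivars.sh
% pgcc -fast -mp -o myomp myomp.c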

Intel Compilers

The Intel suite includes C, C++, FORTRAN77 and FORTRAN90/95 compilers along with OpenMP support and the Intel debugger. Anecdotal evidence suggests that this compiler suite frequently provides the best performance for calculation-intensive applications. Included with these compilers are the Intel Math Kernel Library (MKL), LINPACK and the Intel Performance Primitives (IPP) - all discussed in the scientific libraries section below.

Intel quick-chart
Current Version: 10.1.018
Setup Scripts
bash /usr/local/intel/intelvars.sh
csh/tcsh /usr/local/intel/intelvars.csh
Documentation: C/C++
Fortran
Debugger
LINPACK
Math Kernel Library
Intel Performance Primitives
Primary front-ends:
C icc
C++ icpc
Fortran77/90/95 ifort
MPI Installations:
Ethernet (32-bit): /usr/local/mpich2-intel
Ethernet (64-bit): /usr/local/mpich2-intel64
Myrinet (32-bit): /usr/local/mpich-gm2k-i
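
A minimal sketch of an Intel build (the source file names are placeholders):

% source /usr/local/intel/intelvars.sh
% icc -O3 -o myprog myprog.c
% ifort -O3 -o myprog myprog.f90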

Pathscale Compilers

The Pathscale suite is often used to generate highly optimized binaries on Opteron systems. It includes a debugger and an option generator that can analyze source code and suggest compiler options. This is the only compiler suite that can be used to build MPI binaries for Biowulf's Infinipath network.

Pathscale quick-chart
Current Version: 3.1
Setup Scripts
bash /usr/local/pathscale/pathvars.sh
csh/tcsh /usr/local/pathscale/pathvars.csh
Primary front-ends:
C pathcc
C++ pathCC
Fortran77 pathf90 (not pathf77)
Fortran90 pathf90
Fortran95 pathf95
MPI Installations:
Ethernet (32-bit): /usr/local/mpich2-pathscale
Ethernet (64-bit): /usr/local/mpich2-pathscale64
Myrinet (32-bit): /usr/local/mpich-gm2k-ps
Infinipath: MPI wrappers in /usr/bin on the Infinipath nodes will use the Pathscale compilers and link the correct (Infinipath) libraries by default.
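
A minimal sketch of a Pathscale build (the source file name is a placeholder):

% source /usr/local/pathscale/pathvars.sh
% pathcc -O3 -o myprog myprog.c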

Documentation:


Java

Several Java Development Kits are installed in /usr/local/java. Older versions are available for applications that require them; the latest is usually the best choice. For very specific situations, 64-bit JDKs are available in /usr/local/java64. These are not recommended when the 32-bit JDKs will do, but are provided for completeness and for users who know they need a 64-bit Java.
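
To use a particular JDK, prepend its bin directory to your PATH (the version directory below is hypothetical; list /usr/local/java to see what is actually installed):

% ls /usr/local/java
% export JAVA_HOME=/usr/local/java/jdk1.6.0     # hypothetical version directory
% export PATH=$JAVA_HOME/bin:$PATH
% java -version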

Scripting Languages

While not usually appropriate for high-performance calculations or distributed-memory tasks, scripting languages can be very useful for managing jobs or processes at a higher level, sorting data, or handling any number of simple tasks. Biowulf includes many scripting languages, provided either by the operating system or by the Biowulf staff.

Among these, Perl is arguably the most useful. The Perl installation available in the default PATH on all nodes includes an extensive set of modules and extensions built or installed by the Biowulf staff. Other scripting environments include Python, Ruby, PHP, Tcl, a host of small special-purpose languages and, of course, your shell.
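
For example, the default Perl can load any installed module directly from the command line; the second line below is a quick (hypothetical) way to check whether a given module is present:

% perl -MList::Util=sum -le 'print sum(1..100)'    # prints 5050; List::Util ships with Perl
% perl -MBio::SeqIO -e 1                           # exits silently if the module (here BioPerl, as an example) is installed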


Parallel jobs and MPI (Message Passing Interface)

Parallel, distributed-memory applications on Biowulf usually use MPI for inter-process communication. MPI is an application programming interface standard that currently exists in two major versions: MPI1 and MPI2. Both standards are supported on the Ethernet cluster via MPICH2, a popular MPI implementation. The Infinipath and Myrinet2000 portions of Biowulf support only the MPI1 standard via vendor-supplied MPI implementations. The Infiniband cluster uses MVAPICH2, a complete MPI1 and MPI2 implementation. You can read more about the MPI standard here. The Helix staff has prepared and made available the relevant MPI installations for our supported networks.
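
The build examples below compile small MPI source files (hello_world.f90, mympisourcefile.c); as a sketch, a minimal MPI program in C looks like this (the file name is hypothetical and the code is not tied to any particular interconnect):

/* hello_mpi.c - a minimal MPI example */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);                 /* initialize the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, nprocs);
    MPI_Finalize();
    return 0;
}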

MPI for Ethernet
NOTE:
This section pertains mainly to new development as the Biowulf staff has moved from MPICH1 to MPICH2 as the currently maintained MPI implementation for Ethernet. MPICH1 is still supported - users with existing binaries/build systems can consult the MPICH1 page to reference Biowulf's site-specific information on MPICH1.

MPI over Ethernet on the Biowulf cluster is primarily provided by MPICH2, an implementation developed at Argonne National Laboratory. To use this MPI, the user will first have to decide on a compiler and target architecture, then consult the chart below to find the correct MPICH installation.

MPICH installations for Ethernet
Compiler    Architecture       MPI installation
GCC         i686 (32-bit)      /usr/local/mpich2
GCC         x86_64 (64-bit)    /usr/local/mpich2-gnu64
PGI         i686 (32-bit)      /usr/local/mpich2-pgi
PGI         x86_64 (64-bit)    /usr/local/mpich2-pgi64
Intel       i686 (32-bit)      /usr/local/mpich2-intel
Intel       x86_64 (64-bit)    /usr/local/mpich2-intel64
Pathscale   i686 (32-bit)      /usr/local/mpich2-pathscale
Pathscale   x86_64 (64-bit)    /usr/local/mpich2-pathscale64

Documentation: MPICH2 Documentation download page

As an example, here we are preparing to build an MPI project that uses Ethernet as its MPI network, and we want to use the Intel compilers to create 64-bit binaries. Note that when building a 64-bit application, the user will first need to gain an interactive session on a 64-bit node - our default interactive nodes are all 64-bit (for 32-bit builds, stay on the Biowulf head-node).

Here we get an interactive session, source the appropriate environment set-up script so that the compilers referenced by the MPI wrappers are available, and then add the MPI bin directory to the beginning of our PATH:

% qsub -I -l nodes=1
qsub: waiting for job 1579075.biobos to start
qsub: job 1579075.biobos ready

[janeuser@p2 ~]$ source /usr/local/intel/intelvars.sh
    Arch is x86_64.
    setting up for Intel C compiler version 10.1.018.
    setting up for Intel Fortran compiler version 10.1.018.
    setting up for Intel debugger verion 10.1.018.
[janeuser@p2 ~]$ export PATH=/usr/local/mpich2-intel64/bin:$PATH

Now we can use the MPI wrappers in our PATH to build MPI programs (see the MPICH2 documentation page for information on the MPI wrappers and how to use them):

[janeuser@p2 ~]$ mpif90 -o mpitest hello_world.f90

For complete documentation on using MPICH2, consult the latest version of the MPICH2 user's guide here (Argonne National Laboratory's MPICH2 site).

Running Ethernet MPI Applications

MPI for Myrinet 2000

A portion of the Biowulf cluster has access to a Myrinet2000 network (also called GM2k), which has some performance advantages over Ethernet: lower latency and roughly twice the bandwidth of gigabit Ethernet. Currently, only 32-bit applications are supported on the GM2k network. This chart shows the MPI installations used on Biowulf for building MPI applications that will run on the Myrinet network.

MPICH installations for Myrinet 2000 (GM2k)
Compiler    MPI installation
GCC         /usr/local/mpich-gm2k
PGI         /usr/local/mpich-gm2k-pg
Intel       /usr/local/mpich-gm2k-i
Pathscale   /usr/local/mpich-gm2k-ps

The process for building GM2k applications is similar to the Ethernet MPI build process: first select and source a compiler set-up script, then add the bin directory of the appropriate GM2k installation to your PATH, and finally use the wrappers in your PATH to build the project.

% source /usr/local/pgi/pgivars.sh
    Arch is i386
    PGI Server Suite version: 7.2-3
% export PATH=/usr/local/mpich-gm2k-pg/bin:$PATH

Now we can use the MPI wrappers in our PATH (mpicc, mpicxx, mpif90 etc.) to build MPI programs for Myrinet.

Running Myrinet2000 Jobs

MPI over Infinipath

A portion of the Biowulf cluster has access to an Infinipath network (Infinipath being a subset of the Infiniband standard optimized for message passing) that is intended for high-performance parallel applications.

Building and running MPI applications for Infinipath requires the user to log into an Infinipath node and source the Pathscale compiler set-up script before using the MPI wrapper scripts in the default PATH on the Infinipath nodes.

Note: only the Pathscale compiler suite should be used to build binaries for the Infinipath cluster, as the MPI libraries for that interconnect seem to have stability problems when combined with the run-times of other compilers (your mileage may vary).

% qsub -I -l nodes=1:ipath
qsub: job 1580340.biobos ready

% source /usr/local/pathscale/pathvars.sh
Setting Pathscale compiler version 3.1
Architecture is x86_64
% mpicc -o mympiprog mympisourcefile.c

Infinipath MPI Wrappers
Language/Compiler      MPI wrapper
C (pathcc)             /usr/bin/mpicc
C++ (pathCC)           /usr/bin/mpicxx
Fortran77 (pathf90)    /usr/bin/mpif77
Fortran90 (pathf90)    /usr/bin/mpif90

Documentation: The Infinipath MPI libraries and wrapper scripts were derived from the MPICH1 MPI implementation. Consult the MPICH1 pages for specific documentation on using these wrapper scripts, keeping in mind that the target interconnect is an Infinipath network and not Ethernet. The wrapper scripts listed above also contain much useful information in their comments.

Running Infinipath Jobs

MPI over Infiniband

A portion of the Biowulf cluster has access to an Infiniband network, which is currently the fastest network on this cluster in terms of bandwidth. Building MPI applications for Infiniband requires the user to log into an Infiniband node and use the MPI wrappers found in the default PATH. Currently, GCC is the only directly supported compiler back-end for these scripts. The Biowulf staff can build an Infiniband MPI for the other available compilers (Intel, PGI, Pathscale) upon request.

% qsub -I -l nodes=1:ib
qsub: job 1580340.biobos ready

% mpicc -o mympiprog mympisourcefile.c

Infiniband MPI Wrappers
Language/Compiler      MPI wrapper
C (gcc)                /usr/bin/mpicc
C++ (g++)              /usr/bin/mpicxx
Fortran77 (g77)        /usr/bin/mpif77
Fortran90 (gfortran)   /usr/bin/mpif90

Documentation: The default MPI implementation on the Infiniband cluster is MVAPICH2. Documentation for MVAPICH2 can be found here.

Running Infiniband Jobs

Scientific Libraries

Listed here are a few of the more notable libraries and suites available to developers of scientific and/or high-performance software. These are mostly various implementations of BLAS, LAPACK, etc.; developers should review each to find the one that best fits their needs.

FFTW

FFTW is a popular open-source fast Fourier transform library. 32- and 64-bit versions of the library can be found here:

/usr/local/fftw-2.1.5/
/usr/local/fftw-2.1.5_x86_64/
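
A hypothetical compile/link line against the 32-bit installation (the include/ and lib/ subdirectory layout under these prefixes is an assumption; adjust the prefix for the 64-bit build):

% gcc -o myfft myfft.c \
      -I/usr/local/fftw-2.1.5/include \
      -L/usr/local/fftw-2.1.5/lib -lfftw -lm
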
Intel Math Kernel Library (MKL)

The Intel Math Kernel Library (MKL) is a set of optimized, threaded math routines for scientific, engineering and financial applications. It includes BLAS, LAPACK, ScaLAPACK, FFTs, a vector math library and random number generators. It is installed on Biowulf in:

/usr/local/intel/mkl/current

Intel MKL Documentation
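
A hypothetical 64-bit link line using the layered MKL libraries (the lib subdirectory and library names vary with MKL version and threading model; consult the MKL documentation for the combination appropriate to your build):

% source /usr/local/intel/intelvars.sh
% icc -o mydgemm mydgemm.c \
      -L/usr/local/intel/mkl/current/lib/em64t \
      -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread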

AMD Core Math Library (ACML)

The ACML is AMD's implementation of several common math routine libraries: full Level 1, 2 and 3 BLAS, LAPACK, FFTs and a number of routine sets specific to the ACML. These should run exceptionally well on Biowulf's Opteron nodes. AMD provides ACML builds for each of the major Fortran compiler suites; installations are available here according to compiler:

Compiler                 ACML installation
GNU (gfortran)           /usr/local/acml/gfortran
Intel (ifort)            /usr/local/acml/ifort
PGI (pgf77/90/95)        /usr/local/acml/pgi
Pathscale (pathf90/95)   /usr/local/acml/pathscale

ACML Documentation
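
A hypothetical link line against the ifort build of the ACML (the lib subdirectory is an assumption; substitute the installation matching your compiler from the chart above):

% source /usr/local/intel/intelvars.sh
% ifort -o myblas myblas.f90 -L/usr/local/acml/ifort/lib -lacml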

Intel Integrated Performance Primitives (IPP)

From Intel's website:

Integrated Performance Primitives (Intel® IPP) is an extensive library of multi-core-ready, highly optimized software functions for multimedia, data processing, and communications applications.

It includes, among other things, routines for audio/video encoding and decoding, image processing, signal processing, vector/matrix math and data compression. IPP is installed in:

/usr/local/intel/ipp/current

Documentation:
GNU Scientific Library (GSL)

This is the open-source scientific library provided by the GNU project for C and C++ developers. It offers a large collection of math routines with an extensive test suite. A complete list of routines and capabilities is available on the GSL website. It is installed by architecture:

/usr/local/gsl-x86_64
/usr/local/gsl-i686

GSL Documentation
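
A small sketch of a GSL call and its compile line (the include/ and lib/ subdirectory layout under the listed prefixes is an assumption):

/* bessel.c - evaluate a Bessel function with GSL (hypothetical example) */
#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int main(void)
{
    double x = 5.0;
    printf("J0(%g) = %.15g\n", x, gsl_sf_bessel_J0(x));   /* regular cylindrical Bessel function J0 */
    return 0;
}

% gcc -o bessel bessel.c \
      -I/usr/local/gsl-x86_64/include \
      -L/usr/local/gsl-x86_64/lib -lgsl -lgslcblas -lm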

Debuggers

Debuggers and memory/thread profilers are often tied to a specific compiler suite and intended for use with the accompanying compiler, though some work across suites as well. A couple of generic debuggers/profilers are also included here.

GNU Debugger (GDB)

GDB is part of the GNU project and is available on all nodes by default. Documentation is available on the GDB website and by typing "man gdb".
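
A typical session (the program name is a placeholder): compile with debugging symbols, then run the binary under gdb:

% gcc -g -O0 -o myprog myprog.c
% gdb ./myprog
(gdb) break main
(gdb) run
(gdb) backtrace
(gdb) quit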

Valgrind

Valgrind is a widely used tool suite for memory profiling and debugging. Documentation is available on the Valgrind website or by typing "man valgrind".
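
For example, to check a (hypothetical) program for memory leaks:

% valgrind --leak-check=full ./myprog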

Intel Debugger (IDB)

The Intel debugger comes with the Intel compiler suite in 32- and 64-bit flavors. Installations of IDB are available in the Intel compiler suite area:

/usr/local/intel/idb/current (32-bit)
/usr/local/intel/idbe/current (64-bit)

IDB Documentation

Portland Group Debugger

The PGI compilers come with a graphical debugger and memory profiler (pgdbg). Using the GUI requires X on your workstation; when X is not available, pgdbg falls back to a console-only mode. The debugger is present in your PATH after sourcing the appropriate PGI set-up script (see PGI Compilers above).

Portland Group Debugger Documentation

Pathscale Debugger

The Pathscale compilers come with a command-line debugger (pathdb). The debugger is present in your PATH after sourcing the appropriate Pathscale set-up script (see Pathscale Compilers above).

Pathscale Debugger Documentation

Quick-charts

MPI-Wrapper/Interconnect Availability Chart by Compiler
Compiler Ethernet MPI Myrinet MPI Infinipath MPI Infiniband MPI
GCC (gcc, g++, g77, gfortran) Yes Yes -- Yes
PGI (pgcc, pgCC, pgf77/90/95) Yes Yes -- --
Intel (icc, icpc, ifort) Yes Yes -- --
Pathscale (pathcc, pathCC, pathf90/95) Yes Yes Yes --

PATH Settings for MPI by Compiler/Interconnect
Compiler Ethernet Myrinet
GCC 32-bit /usr/local/mpich2/bin /usr/local/mpich-gm2k/bin
GCC 64-bit /usr/local/mpich2-gnu64/bin --
PGI 32-bit /usr/local/mpich2-pgi/bin /usr/local/mpich-gm2k-pg/bin
PGI 64-bit /usr/local/mpich2-pgi64/bin --
Intel 32-bit /usr/local/mpich2-intel/bin /usr/local/mpich-gm2k-i/bin
Intel 64-bit /usr/local/mpich2-intel64/bin --
Pathscale 32-bit /usr/local/mpich2-pathscale/bin /usr/local/mpich-gm2k-ps/bin
Pathscale 64-bit /usr/local/mpich2-pathscale64/bin --