NIH Helix Systems
Steven Fellini (sfellini@nih.gov)
Susan Chacko (susanc@helix.nih.gov)
October 24, 2008
This page is at
http://biowulf.nih.gov/seminar.html
This Biowulf User Guide is at
http://biowulf.nih.gov/user_guide.html
The Biowulf Home Page is at
http://biowulf.nih.gov
Important concepts
Location | Type | Created | Backups | Amount of Space | Accessible from (*)
/home | network (NFS) | with Helix account | yes | 1 GB (quota) | B, C
/scratch (nodes) | local | by user | no | 30 - 166 GB, dedicated while node is allocated | C
/scratch (biowulf) | network (NFS) | by user | no | 500 GB, shared | B, H
/data | network (NFS) | with Biowulf account | yes | based on quota (48 GB default), HSM-managed | B, C, H

(*) B = Biowulf login node, C = compute nodes, H = Helix

Snapshots: see http://helix.nih.gov/new_users/backups.html
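Because /scratch on the nodes is local disk that exists only while the node is allocated, a common pattern is to stage input in, compute locally, and copy results back. A minimal sketch, assuming a program myprog and files under /data/me (all names here are placeholders):

#!/bin/bash
# this file is scratchjob.sh (hypothetical example)
#PBS -N ScratchDemo
mkdir -p /scratch/$USER                  # node-local scratch, created by the user
cp /data/me/mydata /scratch/$USER/       # stage input from NFS onto local disk
cd /scratch/$USER
myprog -a 100 < mydata > results         # compute against local files
cp results /data/me/                     # copy output back before the node is freed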
#!/bin/tcsh
#
# this file is myjob.sh
#
#PBS -N MyJob
#PBS -m be
#PBS -k oe
#
myprog -a 100 < /data/me/mydata
qsub -l nodes=1 myjob.sh
Notes:
#!/bin/bash
#
# this file is myjob.sh
#
#PBS -N MyJob
#PBS -m be
#PBS -k oe
#
myprog -a 100 < infile1 > outfile1 &
myprog -a 200 < infile2 > outfile2 &
wait
Notes:
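After qsub, a job can be tracked with standard PBS commands; a brief illustration (the job id is whatever qsub prints back):

% qsub -l nodes=1 myjob.sh     # prints the job id
% qstat -u me                  # job states: Q = queued, R = running
# With "#PBS -k oe", stdout/stderr are kept in the home directory
# as MyJob.o<jobid> and MyJob.e<jobid>.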
qsub -I -l nodes=1

Demo: Matlab on an interactive node.
Demo: easyblast
[Diagram: Biowulf cluster layout - fileservers and network switches; interconnects include 10 GbE, GbE, 100Base-T, Myrinet 2000, and Infiniband]
# of nodes | processors per node | memory | network
232 | 4 x 2.8 GHz AMD Opteron 290, 1 MB secondary cache | 8 GB | 1 Gb/s ethernet
471 | 2 x 2.8 GHz AMD Opteron 254, 1 MB secondary cache | 40 x 8 GB, 226 x 4 GB, 91 x 2 GB | 1 Gb/s ethernet; 160 x 8 Gb/s Infiniband
289 | 4 x 2.6 GHz AMD Opteron 285, 1 MB secondary cache | 8 GB | 1 Gb/s ethernet
389 | 2 x 2.2 GHz AMD Opteron 248, 1 MB secondary cache | 129 x 2 GB, 66 x 4 GB | 1 Gb/s ethernet; 80 x 2 Gb/s Myrinet
91 | 2 x 2.0 GHz AMD Opteron 246, 1 MB secondary cache | 48 x 1 GB, 43 x 2 GB | 1 Gb/s ethernet; 48 x 2 Gb/s Myrinet
390 | 2 x 2.8 GHz Intel Xeon, 512 kB secondary cache | 119 x 1 GB, 207 x 2 GB, 64 x 4 GB | 64 x 2 Gb/s Myrinet; 189 x 1 Gb/s ethernet; 201 x 100 Mb/s ethernet
1 | 32 x 1.4 GHz SGI Itanium 2 | 96 GB | 1 Gb/s ethernet
Instead of a batch script with one line per process:

#!/bin/bash
#
# this file is myjob.sh
#
#PBS -N MyJob
#PBS -m be
#PBS -k oe
#
myprog -a 100 < infile1 > outfile1 &
myprog -a 200 < infile2 > outfile2 &
wait
put the commands, one per line, into a command file:

#
# this file is cmdfile
#
myprog -param a < infile-a > outfile-a
myprog -param b < infile-b > outfile-b
myprog -param c < infile-c > outfile-c
myprog -param d < infile-d > outfile-d
myprog -param e < infile-e > outfile-e
myprog -param f < infile-f > outfile-f
myprog -param g < infile-g > outfile-g
swarm -f cmdfile
swarm packs two processes onto each dual-processor node, so:
32 lines (processes) => 16 jobs
2000 lines (processes) => 1000 jobs
# bundle option
swarm -b 20 -f cmdfile

2000 lines (processes) => 100 bundles (20 lines each) => 50 jobs (two bundles per node)
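A command file is usually generated with a loop rather than written by hand. A minimal sketch, assuming input files under /data/me/inputs (the paths and myprog's options are placeholders):

# one command line per input file, then submit with bundling
for f in /data/me/inputs/*; do
    echo "myprog -param a < $f > $f.out"
done > cmdfile
swarm -b 20 -f cmdfile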
memory | 2.6 and 2.8 GHz Opterons (dual-core!) | 2.8 GHz Opterons | SGI Altix 350 (1.4 GHz Itanium2) | SUN X4600 3.0 GHz Opterons (Helix!)
8 GB | 521 | 40 | - | -
96 GB (shared) | - | - | 1 (32 processors) | -
128 GB (shared) | - | - | - | 1 (16 processors)
See also the Firebolt web page (http://biowulf.nih.gov/firebolt.html) for running jobs on the Altix.
property | selects | notes
o2800 | 2.8 GHz Opteron processor |
o2600 | 2.6 GHz Opteron processor | dual-core only
o2200 | 2.2 GHz Opteron processor |
o2000 | 2.0 GHz Opteron processor |
k8 | 2.2 or 2.0 GHz Opteron processor |
p2800 | 2.8 GHz Xeon processor |
m2048 | 2 GB memory |
m4096 | 4 GB memory |
m8192 | 8 GB memory | reserved
x86-64 | 64-bit Linux |
gige | Gigabit ethernet network |
myr2k | Myrinet network | reserved
ib | Infiniband network | reserved
dc | dual-core processors, 8 GB memory | reserved
altix | SGI Altix processor | reserved
Notes:
qsub -l nodes=1 myjob
qsub -l nodes=1:x86-64 my64bitjob
qsub -l nodes=8:o2800:gige -v np=16 namdjob
qsub -l nodes=4:o2000:myr2k -v np=8 mdjob
qsub -l nodes=16:ib -v np=32 bignamd.sh
qsub -l nodes=1:m4096 bigjob.bat
swarm -l nodes=1:m4096 -f bigjobs
qsub -l nodes=1:altix:ncpus=4,mem=12gb verybigmem.bat
$ batchlim
          Max CPUs     Max CPUs
          Per User     Available
          ----------   -----------
ib            32          n/a
norm         160          n/a
nist1         32          172
norm3         32          100
nist2         16           48
$ freeib
13 nodes free in ib (large) queue, 25 in ib2 (small)
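These commands are handy just before submitting; for example, checking the 16-node Infiniband job from the earlier examples (with only 13 large-queue nodes free, the job would wait in the queue until enough nodes drain):

% freeib
13 nodes free in ib (large) queue, 25 in ib2 (small)
% qsub -v np=32 -l nodes=16:ib bignamd.sh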
 | Biowulf login node (Xeon) | compute nodes (Xeon) | compute nodes (Opteron) | compute nodes (Opteron/Myrinet) | compute nodes (Opteron/IB)
hardware | 64-bit | 32-bit | 64-bit | 64-bit | 64-bit
system software/compilers | 32-bit | 32-bit | 64-bit | 64-bit | 64-bit
application software | 32-bit | 32-bit | 32-bit, 64-bit | 32-bit, 64-bit | 32-bit, 64-bit
Linux distrib | RHEL 5.2 | CentOS 5.2 | CentOS 5.2 | CentOS 5.2 | CentOS 5.2
Linux kernel | 2.6.18-92.1.13.el5PAE | 2.6.18-53.1.21.el5 | 2.6.18-53.1.21.el5 | 2.6.18-53.1.21.el5 | 2.6.18-53.1.21.el5
C library | glibc-2.5 | glibc-2.5 | glibc-2.5 | glibc-2.5 | glibc-2.5
Note: CentOS is a clone of RHEL
qsub -I -l nodes=1:x86-64
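Once the interactive session starts, the node's architecture can be confirmed directly:

% uname -m
x86_64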
compiler | front-ends | environment setup
GCC 4.1.2 | gcc (C), g++ (C++), g77 (Fortran77), gfortran (Fortran90/95) | default
PGI 7.2 | pgcc (C), pgCC (C++), pgf77 (Fortran77), pgf90 (Fortran90), pgf95 (Fortran95) | % source /usr/local/pgi/pgivars.sh (or pgivars.csh)
Intel v10.1 | icc (C), icpc (C++), ifort (Fortran77/90/95) | % source /usr/local/intel/intelvars.sh (or intelvars.csh)
Pathscale 3.1 | pathcc (C), pathCC (C++), pathf90 (Fortran77/90), pathf95 (Fortran95) | % source /usr/local/pathscale/pathvars.sh (or pathvars.csh)
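For example, to switch from the default GCC to the Intel compilers from the table (myprog.f90 is a placeholder source file; csh/tcsh users source the .csh variant instead):

% source /usr/local/intel/intelvars.sh
% ifort -O2 -o myprog myprog.f90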
PATH=<MPICH Home>/bin:$PATH       (ethernet)
PATH=<MPICH-GM Home>/bin:$PATH    (Myrinet)

Infiniband only: compile on an IB node (no special PATH is required):

qsub -l nodes=1:ib -I
Compiler | Ethernet (MPICH2) | Myrinet (MPICH1) | Infiniband (MPICH1)
GNU | /usr/local/mpich2, /usr/local/mpich2-gnu64 | /usr/local/mpich-gm2k | default PATH
PGI | /usr/local/mpich2-pgi, /usr/local/mpich2-pgi64 | /usr/local/mpich-gm2k-pg | default PATH
Intel | /usr/local/mpich2-intel, /usr/local/mpich2-intel64 | /usr/local/mpich-gm2k-i | default PATH
Pathscale | /usr/local/mpich2-pathscale, /usr/local/mpich2-pathscale64 | /usr/local/mpich-gm2k-ps | default PATH
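Selecting an MPI build is just a PATH change; for example, to pick up the 64-bit GNU MPICH2 installation from the table:

% PATH=/usr/local/mpich2-gnu64/bin:$PATH
% which mpicc
/usr/local/mpich2-gnu64/bin/mpicc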
C | mpicc
C++ | mpicxx
Fortran | mpif77, mpif90
#include <stdio.h>
#include <string.h>     /* strlen */
#include <unistd.h>     /* gethostname */
#include "mpi.h"

int main(int argc, char **argv)
{
    int myrank, n_processes, srcrank, destrank;
    char mbuf[512], name[40];
    MPI_Status mstat;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &n_processes);

    if (myrank != 0) {
        /* every non-zero rank sends a greeting to rank 0 */
        gethostname(name, 39);
        sprintf(mbuf, "Hello, from process %d on node %s!", myrank, name);
        destrank = 0;
        MPI_Send(mbuf, strlen(mbuf)+1, MPI_CHAR, destrank, 90, MPI_COMM_WORLD);
    } else {
        /* rank 0 collects and prints the greetings */
        for (srcrank = 1; srcrank < n_processes; srcrank++) {
            MPI_Recv(mbuf, 512, MPI_CHAR, srcrank, 90, MPI_COMM_WORLD, &mstat);
            printf("From process %d: %s\n", srcrank, mbuf);
        }
    }
    MPI_Finalize();
    return 0;
}
% PATH=/usr/local/mpich/bin:$PATH
% mpicc -o hello_mpi hello_mpi.c
#!/bin/bash
# This file is hello-mpich2.bat
#
#PBS -N Hello

PATH=/usr/local/mpich2/bin:$PATH; export PATH
mpdboot -f $PBS_NODEFILE -n `cat $PBS_NODEFILE | wc -l`
mpiexec -n $np /home/steve/hello/hello-mpich2
mpdallexit
qsub -v np=16 -l nodes=8 hello-mpich2.bat
Infiniband only:
% cd ~/.ssh
% cp /usr/local/etc/ssh_config_ib config
% chmod 600 config
#!/bin/bash
# This file is hello-mpich.bat
#
#PBS -N MyJob
#PBS -m be
#PBS -k oe

PATH=/usr/local/mpich/bin:$PATH; export PATH
mpirun -machinefile $PBS_NODEFILE -np $np hello_mpi
qsub -v np=8 -l nodes=4:myr2k hello-myr.sh
qsub -v np=16 -l nodes=8:ib hello-ib.sh