3.1 Modifications for .cshrc and .login startup files on discover.
It may be necessary to put architecture-specific options into your login files to set up the environment you desire. When using bash, sh, or ksh, you may include the following syntax in your .profile or .bashrc file to set up a different environment when logging into Linux Networx systems:
SYSTYPE=`/bin/uname -m`
if [ "$SYSTYPE" = "" ] ; then
    SYSTYPE=`/usr/bin/uname -m`
    if [ "$SYSTYPE" = "" ] ; then
        echo "Cannot determine system architecture"
    fi
fi
if [ "$SYSTYPE" != "ia64" -a "$SYSTYPE" != "IP35" -a \
     "$SYSTYPE" != "x86_64" ] ; then
    echo "Invalid system architecture"
fi
if [ "$SYSTYPE" = "IP35" ] ; then
    echo "Put SGI IRIX specifics here"
elif [ "$SYSTYPE" = "ia64" ] ; then
    echo "Put SGI Altix specifics here"
elif [ "$SYSTYPE" = "x86_64" ] ; then
    echo "Put Linux Networx specifics here"
fi
When using csh or tcsh, the following syntax may be included into your .cshrc file to set up a different environment when logging into the NCCS systems:
switch ( `uname -m` )
case x86_64:
    setenv SYSTYPE x86_64
    echo "Put LNXI specifics here"
    breaksw
case ia64:
    setenv SYSTYPE ia64
    echo "Put SGI Altix specifics here"
    breaksw
case IP35:
    setenv SYSTYPE IP35
    echo "Put SGI IRIX specifics here"
    breaksw
default:
    setenv SYSTYPE unknown
    echo "Unable to determine SYSTYPE"
    breaksw
endsw
Similar things can be done based on hostname or O/S type. Adding the above to your login files will ensure that you set up the correct software modules or system specific environment variables at login.
Back to index
4. Setting up authorization keys on NCCS systems
Creating a new authorized_keys file on dirac
If it does not yet exist, create the .ssh directory under your home directory on dirac and restrict its permissions (sshd requires that it not be group- or world-writable):
mkdir .ssh
chmod 700 .ssh
Create your public identity file, id_dsa.pub, on one of the NCCS HEC systems:
ssh-keygen -t dsa
Copy the file id_dsa.pub into authorized_keys on the system for which you just generated it. If the file authorized_keys already exists on the system, append the contents of id_dsa.pub.
Copy the contents of the authorized_keys file to dirac:
scp -oport=2222 my_home_dir/.ssh/authorized_keys \
    mylogin@dirac.gsfc.nasa.gov:.ssh/authorized_keys
Note that the above may overwrite the file on dirac.
Once the authorized_keys file is in place on both dirac and the NCCS HPC system, you can start using passwordless scp and sftp.
4.1 Adding to an existing authorized_keys file
For each NCCS high-performance computing system for which you do not yet have an entry in your dirac authorized_keys file:
Log into that NCCS high-performance computing system and create your key pair; leave the passphrase empty so that a public and private key are generated for passwordless use. The public identity file is id_dsa.pub:
ssh-keygen -t dsa
The id_dsa.pub file will be in the .ssh directory under your home directory on the NCCS high-performance computing system.
Copy your existing authorized_keys file on dirac into a temporary file:
scp -oport=2222 mylogin@dirac.gsfc.nasa.gov:.ssh/authorized_keys dirac_auth
Concatenate your id_dsa.pub file onto the temporary copy. To append:
cat id_dsa.pub >> dirac_auth
Replace the authorized_keys file on dirac with the contents of the new file:
scp -oport=2222 dirac_auth mylogin@dirac.gsfc.nasa.gov:.ssh/authorized_keys
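The fetch-append-replace steps above can also be collapsed into a single pipeline run on the HPC system (a sketch, assuming OpenSSH and the port-2222 sshd on dirac; unlike the scp above, this appends rather than overwrites):

```shell
# Append the local public key to the authorized_keys file on dirac in one step.
cat ~/.ssh/id_dsa.pub | ssh -p 2222 mylogin@dirac.gsfc.nasa.gov \
    'cat >> ~/.ssh/authorized_keys'
```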
Back to index
5. Transferring code and data files from the user's system (scp).
You cannot transfer files, such as data sets or application files, directly to the LNXI front end; you must use the mass storage platform (dirac) to transfer files from outside into the NCCS environment.
Use the secure copy command (scp) to transfer your code or data from your workstation to dirac. For more information, issue the command man scp or see our ssh page.
To scp the file "myfile" from your workstation's home directory to your dirac home directory, issue the following command:
scp /home/myfile your_userid@dirac.gsfc.nasa.gov:myfile
Similar to an initial login, you will be asked to provide both your PASSCODE and your password. Note that this is your dirac password and not your LNXI discover password, which may be different.
Depending on whether or not your login directory is set up to be your home directory on dirac, files may be transferred either to your home directory or your mass storage directory. Your home directory, mass storage directory, nobackup file systems, and others are available on dirac. It is important to check your origin and destination directories carefully before transferring any files to avoid accidentally overwriting data.
If your workstation allows inbound scp connections, you may issue the scp command (secure copy) on the HPC system, to transfer your code from your workstation to the HPC system. Consult man scp for more information.
For example, suppose you have the file file1 in the /tmp directory of your workstation and you would like a copy on dirac. Issue the following command on dirac:
scp your_workstation_userid@your_workstation:/tmp/file1 ./file1
You will be prompted for your workstation password, and then the transfer will be made.
If your workstation does not allow inbound scp connections, the transfer will need to be initiated on your
workstation. Using the example above, issue the following command on your workstation to copy the file to dirac:
scp /tmp/file1 mylogin@dirac.gsfc.nasa.gov:full_path_name
Enter PASSCODE: 10-digit passcode
Password: dirac-password
path_name 100% |*****************************| size 00:00
5.1 Interactive file transfer with sftp
If you have access to our mass storage system, then you can use sftp to transfer files to dirac for long-term storage. See discover documentation on data storage for more information.
% sftp mylogin@dirac.gsfc.nasa.gov
Connecting to dirac.gsfc.nasa.gov ...
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'dirac.gsfc.nasa.gov' (RSA)
to the list of known hosts.
Enter PASSCODE: 10-digit passcode
Password: dirac-password
sftp> get [file(s) or put file(s) or help]
5.2 Transferring files and code under Microsoft Windows
Most Windows-based ssh clients support some form of scp or sftp. Cygwin supports Unix-like command-line and X Window System programs.
Back to index
6. Compiling and running code (f90, cc, prun, etc.).
There are compilers from different vendors on the system. The one available to you for use is determined by the module you currently have loaded. See the section on modules below and the section "On using MPI".
How a compiler is invoked to compile source code depends on which vendor's compiler module you have loaded.
For Intel, the compilers are invoked as ifort, icc and icpc for Fortran, C, and C++ respectively.
6.1 Paths to Libraries
Please see the section on modules. To display a list of environment variables and other information about an individual module, invoke the following:
% module show <module-name>
6.2 Memory and limits.
At present, the best advice for users is to "force" their own limits depending on the nature of the jobs. For example, for a sequential or OpenMP job confined to one node, the user might be advised to set both memoryuse and vmemoryuse to 3145728, i.e. 3 GB in units of KB. For jobs with multiple processes per node, the user might be advised to set those same limits to approximately 3145728 / <number-of-processes>. This way, jobs will die quickly when they hit these limits as opposed to "sleeping" in the queue until the queue's time limit is reached.
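The division described above can be sketched in bash (the 3 GB figure follows the guidance in this section and is not a site guarantee; adjust for your node type):

```shell
# Derive a per-process memory limit for a node shared by N processes.
PROCS_PER_NODE=4
NODE_LIMIT_KB=3145728                      # 3 GB expressed in KB
LIMIT_KB=$((NODE_LIMIT_KB / PROCS_PER_NODE))
echo "per-process limit: $LIMIT_KB KB"
# Apply with the shell's limit mechanism:
#   bash/sh/ksh:  ulimit -m $LIMIT_KB ; ulimit -v $LIMIT_KB
#   csh/tcsh:     limit memoryuse ${LIMIT_KB}k ; limit vmemoryuse ${LIMIT_KB}k
```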
Back to index
7. Loading, unloading or swapping modules.
Several versions of compilers from different vendors and other support applications are available for users on the Linux Networx systems. These applications are loaded into your environment through the use of modules. When you log into the Linux Networx system, no modules are loaded by default. To see which modules you currently have loaded, issue the following command:
% module list
To see all available modules, issue this command:
% module avail
The above will display a complete list of modules that are available to be loaded into your environment. You can load, unload, and even swap modules using the module command. To display a list of environment variables and other information about an individual module, invoke the following.
% module show <module-name>
Other commonly-used module commands are:
% module load module_name
to load a module called "module_name" from the list of available modules.
% module unload module_name
to unload an already loaded module called "module_name".
% module swap loaded_module new_module
to remove the module called "loaded_module" and replace it with the module called "new_module".
This command is typically used to switch between two different versions of the same application or compiler. It may also be used to switch between compilers from different vendors.
An example of how to use the module command to prepare the environment for either running or developing MPI applications follows.
% module load comp/intel-9.1.042 mpi/scali-5.1.0.1 lib/mkl-9.0.017
This command loads the Intel version 9.1.042 compiler suite, the Scali version 5.1.0.1 MPI libraries, and Intel's Math Kernel Library, version 9.0.017. Note that the major version numbers of the MKL and the Intel compiler suite should match (here, version 9). To remove all modules from the environment, use:
% module purge
It may be necessary to "unsetenv" or explicitly clear some environment variables; check with NCCS User Support if this seems to be a problem. For more information about the module command, see the module man page.
Back to index
8. On using the Message Passing Programming Model (MPI)
To compile programs using MPI and link with the necessary MPI libraries, you must prepare the environment using the module command. This applies to both the development and the run-time environments (see below). Additional compiler or mpirun flags may be required to obtain particular functionality, such as RDMA; these flags may vary among the different compiler and MPI vendors.
Set up the environment before compiling.
$ module purge
$ module load comp/intel-9.1.042 mpi/sst-3.3.0.8.4 lib/mkl-9.0.017
$ # The Silver Storm MPI stack requires the following environment variables
$ # to point to the current compiler suite.
$ export MPICH_CC=icc
$ export MPICH_CLINKER=icc
$ export MPICH_F77=ifort
$ export MPICH_FLINKER=ifort
To compile:
$ mpicc -c myprogram1.c
$ mpif77 -c myprogram2.f
To link:
$ mpicc -o myprogram1.x myprogram1.o
$ mpif77 -o myprogram2.x myprogram2.o
To run the program one must wrap the MPI executable in a shell script that loads the appropriate environment. An example script, "run.sh", follows.
#!/bin/bash
# run.sh: A script to prepare the environment for an MPI executable.
# Prepare the environment
. /usr/share/modules/init/bash
module purge
module load comp/intel-9.1.042 mpi/sst-3.3.0.8.4 lib/mkl-9.0.017
./myprogram1.x
# end of run.sh
This script, "run.sh", and the executable must be copied to a location accessible to the compute nodes on which it will run. Because most such runs will be executed as batch jobs, it makes sense to write another script which prepares the directories, copies the appropriate files, and then starts the run. An example of such a script, entitled "go.sh", follows.
#!/bin/bash
# go.sh: Get things ready and then execute run.sh.
# # If used as a PBS script, put PBS commands here.
# Prepare the environment
. /usr/share/modules/init/bash
module purge
module load comp/intel-9.1.042 mpi/sst-3.3.0.8.4 lib/mkl-9.0.017
# Prepare directories and stage files.
# Execute a 4 process (4 cores on the LNXI) run.
mpirun -np 4 run.sh
# Collect data and clean up as necessary.
# end of go.sh
These two files, "go.sh" and "run.sh", should have execute permission set (chmod +x go.sh run.sh).
8.1 Differences between vendors' MPI distributions
On the LNXI cluster, the vendor-provided implementations of MPI provide mpicc, mpif77, and related scripts that should correctly handle include and link dependencies. Occasionally a compile or link search may hit an incompatible default (or other) compiler, MPI, or library suite and signal an error. Purging and reloading the environment's modules, cleaning, and recompiling may help.
Even after using "module" to load the environment, the MPI libraries may still require certain environment variables to be set to the appropriate compilers and linkers. If these are not set properly, then a system default may be executed yielding unexpected results and possibly affecting performance. Examples are detailed here (users of csh-style shells should use "set" or "setenv" as appropriate) for the Intel compiler suite. For other compilers, "icc" and "ifort" would be replaced with the equivalents, e.g. for GCC, "gcc" and "g77" respectively, in the following.
# MPICH, SCALI, and SILVERSTORM development require
$ export MPICH_CC=icc
$ export MPICH_CLINKER=icc
$ export MPICH_F77=ifort
$ export MPICH_FLINKER=ifort
# INTEL MPI development requires
$ export MPICH_CC=icc
# SCALI MPI development requires
$ export DEFAULT_CC=icc
$ export DEFAULT_F77=ifort
These can be set before or after loading modules. The module loads and these environment variables may be `sourced' from shell scripts as described above.
Initially NCCS will provide only one MPI environment and users wishing to tune their application against another variant should contact NCCS User Support.
8.2 Cray-style SHMEM parallelism
Several previous NCCS machines have supported so-called SHMEM parallelism, an efficient message-passing mechanism introduced by Cray. This non-standard mechanism is unfortunately not supported on Discover, and users are encouraged to port their applications to MPI. Please contact the NCCS if you desire assistance with such a conversion.
Back to index
9. The OpenMP Programming Model
OpenMP is an extension to standard Fortran, C, and C++ that supports shared memory parallel execution. Intel has recently introduced a new product "Cluster OpenMP" which enables OpenMP to be used across multiple nodes of the platform. Interested users can contact NCCS User Support for more information.
Users can fairly easily add directives to their source code to parallelize their applications and specify certain properties of their variables. To compile your application with OpenMP, issue the -openmp option to the Intel compiler. The Intel compiler supports the OpenMP 2 standard, and MPI and OpenMP can be used to create a hybrid mixed-mode parallel programming model. More information is available in the Intel Fortran compiler User's Guides.
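As a sketch, a build-and-run sequence for an OpenMP program with the Intel compiler might look like the following (the source file name is hypothetical; -openmp is the flag spelling for the Intel 9.x compilers named above):

```shell
# Compile a Fortran source file with OpenMP directives enabled.
ifort -openmp -o myomp.x myomp.f90
# Use at most 4 threads, matching the 4-core nodes described below.
export OMP_NUM_THREADS=4
./myomp.x
```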
9.1 Threads on LNXI discover vs. SGI Altix explore.
Although the number of threads is in principle arbitrary, in practice
most users will find that 4 threads are optimal due to the constraint
that there are only 4 independent processing elements within a single
shared-memory node on Discover. This is quite different than the
situation on the SGI Altix architecture where the number of processing
elements is potentially quite large.
Back to index
10. Silverstorm InfiniBand
Training on how to program clusters based on the LNXI discover architecture may be available; contact NCCS User Support for information.
Back to index
11. Submitting jobs to queues ( PBS, qsub, qstat, etc.).
11.1 qsub
To access the compute hosts in the Linux Networx environment, you
must submit jobs to the batch queues. For more information about the
available batch queues on the Linux Networx system and the amount of
resources that can be requested, consult man pbs.
In general, you will create a batch script and then issue that
batch script to PBS using the following command:
% qsub myscript
This assumes that all the necessary requirements are included in
the batch script itself using comments. Note that you must provide
your Computational Project (formerly Sponsor Code Account) while
running a batch script. Use the getsponsor command to get your
Computational Project information.
To see the status of your job, issue the following command:
% qstat -a
Since all compute hosts in the Linux Networx environment must be
accessed through the PBS batch system, the only way to run an
interactive job on one of the compute engines is through the
following command:
% qsub -I
To specify the total number of CPUs and wallclock time, you may include
those options at the command line. For example, suppose you wanted 16 CPUs for
a total of 4 hours to run some interactive work. For PBS, use "select" and
"ncpus". Summarizing (in script form):
# Do this (or the like via qsub)...
#PBS -l select=<NODES>:ncpus=<CPUs-per-NODE>
...
# And do this...
mpirun -np <any-number-up-to-NODES-times-CPUs-per-NODE> ...
In general, the number of processes can range from 1 to the product <NODES> x <CPUs-per-NODE>. For example, in an interactive batch session:
% qsub -V -I -l select=4:ncpus=4,walltime=4:00:00
The above requests 4 nodes and 4 cpus per node. This will allow you to run your MPI application via mpirun with anywhere from 1 to 16 processes. For example:
% mpirun -np 16 ...
This will launch 16 processes with 4 processes per node. Do not use the -npn option to mpirun. The -V option of qsub ensures your environment is exported to the PBS session.
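The select/ncpus arithmetic above can be sketched in shell (values mirror the select=4:ncpus=4 example):

```shell
NODES=4            # value given to select=
CPUS_PER_NODE=4    # value given to ncpus=
MAX_NP=$((NODES * CPUS_PER_NODE))
echo "mpirun -np may be any value from 1 to $MAX_NP"
```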
In some cases, your job will not be started immediately but will
start when sufficient resources become available.
11.2 Permissions in PBS & MPI (umask, chmod...)
PBS honors umask just fine, but if you create a file from within your mpirun job, umask is not honored. Do a chmod after your mpirun to assure that any files created during your mpirun have the permissions you desire.
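A sketch of the workaround (filenames hypothetical):

```shell
# umask governs files the batch script itself creates, but files written
# from within mpirun may come out with different permissions.
umask 027
touch pbs_created.txt            # created by the script: mode 0640
# mpirun -np 4 ./myprogram.x    # may create, e.g., mpi_output.dat
touch mpi_output.dat             # stand-in for a file mpirun created
chmod 640 mpi_output.dat         # enforce the permissions you want afterwards
ls -l mpi_output.dat
```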
11.3 PBS Sample Script
#!/usr/local/bin/csh
#PBS -S /bin/csh
###-S sets job's shell path
#PBS -N Test_PBS_Job
###-N sets job's name
#PBS -l select=9:ncpus=4,walltime=00:10:00
###-l sets job's resource list
#PBS -j oe
###-j joins the Standard error and standard output into one file.
###(separate files are generated by default)
#PBS -W group_list=computational_project
###-W specifies the Computational Project under which the job will run.
###and from which the cpu hours will be deducted
# By default, PBS executes your job from your home directory.
# However, you can either use the environment variable
# PBS_O_WORKDIR to change to the submission working directory,
# or you can declare a new variable by using setenv.
cd $PBS_O_WORKDIR
##or
##setenv my_work_dir /home0/myuserid/workdir
##cd $my_work_dir
cp ./Test_PBS_Job /discover/nobackup/myuserid
cd /discover/nobackup/myuserid
# mpirun -np 34
# or mpirun -np 36
## note mpirun -np < a number in 1..36 >
ls -la > PBS_output
# copy output files back to the PBS working directory
cp PBS_output $PBS_O_WORKDIR
##or cp output $my_work_dir
exit 0
11.4 Commands for monitoring jobs - qstat, df, etc.
11.6 Monitoring currently running jobs (stderr/stdout).
/discover/pbs_spool is a 200 GB GPFS filesystem that serves as a globally visible spool directory. The local spool directory on all compute nodes is now a symlink that points to this global spool directory. You should be able to monitor a job's stderr/stdout by going to this directory and finding the appropriate files by their job ids. As with the SGIs, users should not edit or remove any files in this directory, or unpredictable things may happen. The intermediate output files have names such as <job-number>.<node-of-submission>.OU, for example:
userid@discover01:/discover/pbs_spool> ls
1008.borgmg.OU 1224.borgmg.OU 1249.borgmg.OU
1390.borgmg.OU 1628.borgmg.OU
1036.borgmg.OU 1225.borgmg.OU 1256.borgmg.OU
1396.borgmg.OU 1705.borgmg.OU
Please note: this filesystem is not set up for I/O performance or for handling large stderr/stdout files. It is expected that small amounts of text-only output will be written here (and moved back to submission directories at the conclusion of a job). Users with large text I/O requirements should write directly to a file on /nobackup/<userid>/* rather than to stdout.
Any non-PBS files that show up in this directory are subject to deletion at any time and without warning. This filesystem is for PBS spool use only.
If PBS cannot place a stderr/stdout file where it thinks it should go, then it will place the file in /discover/pbs_spool/undelivered.
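For example, to follow a running job's intermediate stdout (the job id and submission node here are taken from the listing above):

```shell
# Follow the intermediate stdout of job 1008, submitted from node borgmg.
tail -f /discover/pbs_spool/1008.borgmg.OU
```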
11.5 Useful links and references for PBS.
Back to index
12. Performance Analysis, debugging, options and specifications control, Intel debugger
Each compiler suite typically provides debugging and optimization tools. On the NCCS LNXI system, these include:
- idb, an Intel debugger
- gdb, GNU's open source debugger
- totalview, from Etnus
Performance analysis and profiling tools include:
- Intel VTune
- Intel Trace Analyzer & Collector
12.1 Trace, open source, and data display debuggers: gdb, pgdbg, codecov, totalview, trace analyzer, trace collector and equivalents
TRACE/Analyzer is a debugger that supports Fortran data types, syntax, and use. The following commands create (compile and link) the executable program and invoke the character-cell interface to the debugger:
% f90 -g ???? -o squares squares.f90
% ???? squares
Totalview, a well-known debugging tool, is also installed on the system for graphical debugging of serial and parallel codes:
% totalview a.out
Trace Analyzer (and Trace Collector) is another tool for analyzing MPI code performance.
% f90 vt.f -lVT -lmpi
% traceanalyzer
After issuing the above commands, you can then load the respective trace file from the "File" menu.
Back to index
13. Etnus's Totalview
To use Totalview in an interactive batch session, try the following.
- Compile your code with the "-g" option to ensure source level debugging.
- Set up ssh keys for passwordless connection to the nodes.
- Set up the Totalview environment, for example:
module load tool/tview-8.0.0.0
- If you are running MPI across more than one node, set the environment variable TVDSVRLAUNCHCMD to ssh.
export TVDSVRLAUNCHCMD=ssh
or
setenv TVDSVRLAUNCHCMD ssh
- Submit the job with "qsub -V -I ...", so that the DISPLAY environment is passed into the PBS job environment.
- There are several ways to launch Totalview.
- For MPI code using "mpi/scali-5.3", launch Totalview as follows.
mpirun -tv -np <number-of-processes> <your-executable>
The -tv tells Scali to run with Totalview.
- For sequential code, run Totalview as follows.
totalview <your-executable>
- For OpenMP code, set the OMP_NUM_THREADS environment variable to the desired number of threads, four or fewer for our 4-core nodes, and launch as follows.
totalview <your-executable>
Back to index
14. Software libraries and applications
- NetCDF (Network Common Data Format) - is a set of interfaces for array-oriented data access. Further information regarding netCDF can be found at: http://www.unidata.ucar.edu/software/netcdf
- PAPI - The PAPI (Performance Application Programming Interface) library from the Innovative Computing Laboratory at the University of Tennessee-Knoxville will be available on the new NCCS linux cluster. PAPI is an effort to establish a uniform, standard programming interface for accessing hardware performance counters on modern microprocessors. The PAPI web site is located at: http://icl.cs.utk.edu/papi/
- TAU (Tuning and Analysis Utilities) - is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java, Python. The TAU website is located at: http://www.cs.uoregon.edu/research/tau/home.php
- Open SpeedShop - Open|SpeedShop is an open source multi platform Linux performance tool which is initially targeted to support performance analysis of applications running on both single node and large scale IA64, IA32, EM64T, and AMD64 platforms. It is explicitly designed with usability in mind and targets both application and computer scientists. Open|SpeedShop's base functionality includes metrics like exclusive and inclusive user time, MPI call tracing, and CPU hardware performance counter experiments. In addition, Open|SpeedShop is designed to be modular and easily extendable. It supports several levels of plugins which allow users to add their own performance experiments. Further information can be found at: http://oss.sgi.com/openspeedshop
- MPICH - a common implementation of MPI. Further information can be found at: http://www-unix.mcs.anl.gov/mpi/mpich/index.htm
- LAM-MPI - an implementation of the Message Passing Interface environment for running applications on clusters. Further information regarding LAM-MPI can be found at the following website: http://www.lam-mpi.org/.
- G95 - open source Fortran 95 compiler and runtime libraries. Additional information can be found at http://www.g95.org/
- FFTW - a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). Further information regarding FFTW can be found at http://www.fftw.org/.
- HDF - The HDF software includes I/O libraries and tools for analyzing, visualizing, and converting scientific data. Further information regarding HDF can be found at http://hdf.ncsa.uiuc.edu/.
- SCSL (the SGI/Cray Scientific Library) will not be available on the system.
Back to index
15. Visualization and graphics tools
The following tools are planned to be available on the Linux Networx systems for NCCS Users.
- NCAR - Libraries and utilities for contour maps, vector and streamline plots, X-Y graphs, map databases, and other visualization tools.
- GrADS - further information can be found at http://www.hipersoft.rice.edu/grads/
- IDL - IDL's features include: advanced image processing, interactive 2-D and 3-D graphics, object oriented programming, insightful volume visualization, a high-level programming language, integrated mathematics and statistics, flexible data I/O, a cross-platform GUI toolkit, and versatile program linking tools.
Back to index
16. Using the data storage system to store and retrieve files
Back to index
17. Storage Quotas
The use of some file systems is controlled by quotas (see below). To determine your resource usage and how it compares to your quota, try the showquota command.
% showquota
If the limits imposed by quotas are a problem for you, please contact User Services.
Back to index
18. Data Storage and the File Systems
Several different types of file systems are provided for user and system use. In general, LNXI nodes (login, gateway, and compute) all see the GPFS file system. In contrast, filesystems external to the LNXI cluster are not available to the compute nodes, but instead are accessible (via NFS mounts) to the login and gateway nodes, which are responsible for data transfer.
- Home File Systems
- Available for login, gateway, and compute nodes.
- Nobackup
- Generally used to store large working files (input and output) used for running applications, post processing, analysis, etc.
- Is not backed up; any files that need to be saved for long periods should be copied into the mass storage directories.
- Scratch
- Set up on each compute host; a temporary directory is created at the time a PBS batch job begins running.
- Accessed via the $TMPDIR environment variable and is the fastest performing file system.
- Temporary storage area created for the life of the PBS batch job; any data that needs to be saved must be removed before the job is completed.
- Mass Storage
- At some point DMF file systems "/g[1-8]" might become available to the LNXI system login and gateway nodes, but not the compute nodes. In the meantime, data transfer between the cluster and mass storage via sftp and scp is discussed elsewhere in this document.
- See the DMF documentation for more information.
One way to make data ready for a large compute run is to first submit a job to the datamove queue in the PBS system to copy files from a mass store or other location to, for example, a /discover/nobackup file system. Jobs in the datamove queue run on a cluster gateway node that has access to external, archive, and cluster-wide file systems. Once the data is on a file system visible to the compute nodes, a compute job using these data can be executed. The results of the compute job can be saved back out to the mass store or other user system using another job in the datamove queue. These batch scripts can submit succeeding scripts, if the expected waits in the queues are not too long. Alternatively, one can submit the three jobs, two to the datamove queue and one to a compute queue, and then use PBS's capability to make one job depend on the completion of other jobs. Review PBS job dependencies and the -W depend=dependency_list argument to qsub for more information.
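A sketch of the dependency-based variant (the script names are hypothetical; qsub prints the new job's id, which seeds the next submission's dependency list):

```shell
# Three-job chain: stage data in, compute, then archive the results.
JOB1=$(qsub -q datamove stage_in.sh)                    # copy data in
JOB2=$(qsub -W depend=afterok:$JOB1 compute.sh)         # run after staging succeeds
qsub -q datamove -W depend=afterok:$JOB2 stage_out.sh   # archive results
```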
18.1 LNXI (discover) File System Access & Policies