
LoadLeveler on Eagle


Introduction

All parallel jobs and resource-intensive sequential jobs must be run through LoadLeveler. (If the login node becomes overloaded with user processes, we may be forced to halt processes that use more than their fair share of it.)

LoadLeveler is the batch-job scheduler for Eagle. It also allocates nodes for interactive parallel jobs. This document provides information for getting started with the batch facilities of LoadLeveler.


Classes

In LoadLeveler parlance, the term "class" is analogous to the term "queue" in other batch systems. Different users may have access to different classes, and different classes may have different job limits or may target different nodes.

Use the "llclass" command to see the current list of classes.

$ llclass
Name              MaxJobCPU      MaxProcCPU   Free  Max   Description
                  d+hh:mm:ss     d+hh:mm:ss   Slots Slots            

sys             -1             -1             274   710   System administration
No_Class        -1             -1             0     1     
batch           -1             -1             260   696   Batch jobs
interactive     -1             -1             274   710   Interactive POE jobs
climate_prod    -1             -1             260   696   Production climate runs
climate_dev     -1             -1             252   696   Development climate runs
bio             -1             -1             260   696   Bioinformatics servers
eere            -1             -1             260   696   Energy efficiency, renewable energy

The "batch" class is the default class for jobs submitted as LoadLeveler scripts. The "interactive" class is the default class for interactive-shell "poe" jobs. Other classes are for specific sets of users, such as system administrators.

Each "Slots" number represents the number of "job instances" that may be started in the given class. For MPI jobs, this is the number of MPI processes that may run under the given class. It is typically equivalent to the number of processors that allow the class. "Max Slots" represents the total number of slots configured on the system, and "Free Slots" represents the number of slots that are not currently occupied.

The "Free Slots" number is misleading, however. A four-processor node may have four slots each for five classes, for example. Because nodes are typically dedicated to a single job, only four of the node's 20 slots can be allocated at a time; the rest appear to be "free" even though they are not usable.

"MaxJobCPU" and "MaxProcCPU" indicate the per-job and per-process aggregate CPU time limits. None of the classes listed here have CPU time limits; this is not particularly useful information because the classes do have wall-clock time limits.

You can get more information on a class, such as its wall-clock time limit, using "llclass -l".

$ llclass -l batch
=============== Class batch ===============
                Name: batch
            Priority: 0
               Admin: 
           NQS_class: F
          NQS_submit: 
           NQS_query: 
      Max_processors: -1
             Maxjobs: -1
       Class_comment: Batch jobs
    Wall_clock_limit:   0+12:00:00, -1
       Job_cpu_limit: -1, -1
           Cpu_limit: -1, -1
          Data_limit: -1, -1
          Core_limit: -1, -1
          File_limit: -1, -1
         Stack_limit: -1, -1
           Rss_limit: -1, -1
                Nice: 0
                Free: 216
             Maximum: 696

The most useful information here is the "Wall_clock_limit", which is set to twelve hours. This is a hard upper limit for any job submitted to the "batch" class. The "-1" indicates there is no soft limit. You may wish to "grep" for useful nuggets like this from the full listing, as in the following example.

$ llclass -l | egrep "Name|Wall_clock_limit"
                Name: sys
    Wall_clock_limit:   1+00:00:00, -1
                Name: No_Class
    Wall_clock_limit: -1, -1
                Name: batch
    Wall_clock_limit:   0+12:00:00, -1
                Name: interactive
    Wall_clock_limit:   0+02:05:00, -1
                Name: climate_prod
    Wall_clock_limit:   1+00:00:00, -1
                Name: climate_dev
    Wall_clock_limit:   0+06:00:00, -1
                Name: bio
    Wall_clock_limit:   0+12:00:00, -1
                Name: eere
    Wall_clock_limit:   1+00:00:00, -1

Some classes have limits on the number of nodes a single job can request (though "batch" and "interactive" currently do not). Unfortunately, "llclass" does not reveal such limits; you have to look directly at the LoadLeveler administration file. The following command will "grep" out the appropriate entries from the appropriate file. It can be run from any Eagle node.

$ egrep "type = class|wall_clock_limit|max_node" /home/loadl/LoadL_admin
batch:          type = class
                wall_clock_limit = 12:00:00
interactive:    type = class
                wall_clock_limit = 02:05:00
sys:            type = class
                wall_clock_limit = 24:00:00
climate_prod:   type = class
                wall_clock_limit = 24:00:00
                max_node = 32
climate_dev:    type = class
                wall_clock_limit = 6:00:00
bio:            type = class
                wall_clock_limit = 12:00:00
eere:           type = class
                wall_clock_limit = 24:00:00

Brave souls may want to look through "/home/loadl/LoadL_admin" for more information. Descriptions of the contents of this file can be found in the online documentation.


System status

Through the "Free Slots" entries, the "llclass" command can give some information about the status of the system and what your chances are for running jobs immediately. As mentioned above, however, this information is misleading. For more accurate information about the load on the system, use the "llstatus" command.

$ llstatus
Name                      Schedd  InQ Act Startd Run LdAvg Idle Arch      OpSys
eagle01s.ccs.ornl.gov     Avail     0   0 Busy     4 4.01  9999 RS6000    AIX43    
eagle02s.ccs.ornl.gov     Avail     0   0 Busy     4 4.00  9999 RS6000    AIX43    
...
eagle99s.ccs.ornl.gov     Avail     0   0 Idle     0 0.00  9999 RS6000    AIX43    
morgan.ccs.ornl.gov       Avail     0   0 Down     0 0.00     0 RS6000    AIX43    

RS6000/AIX43              185 machines      28  jobs    480  running
Total Machines            185 machines      28  jobs    480  running

The Central Manager is defined on morgan.ccs.ornl.gov

All machines on the machine_list are present.

The "Schedd" column indicates whether the node is able to schedule LoadLeveler jobs; "Avail" means it can. "InQ" gives the number of current jobs submitted from (not running on) the given node, and "Act" gives the number of those jobs that are actually running (on other nodes). "Startd" indicates whether any jobs are running on the given node, and "Run" indicates the number of job instances that are running.

For ORNL SP nodes, the "Run" number will be equal to or less than the number of processors in that node. The same job can have more than one instance running on a given node; for example, a four-processor node may have four MPI processes from the same job. SP nodes are typically dedicated, however, so a given node will only run instances of a single job at a time, though it may run multiple instances of that job.

"LdAvg" is the Berkely one-minute load average, and "Idle" is the time in seconds since the last keyboard or mouse activity on the node. For SP nodes, "Idle" is often "9999".

The lines at the bottom of the output indicate that 185 nodes are currently under the control of LoadLeveler. On these nodes, 28 jobs are running, and those 28 jobs consume 480 slots. Because one slot can represent a single-threaded or a multi-threaded process, slots are equivalent to neither processors nor nodes.

Some of the columns of default "llstatus" output are not particularly useful, and "llstatus" is capable of displaying useful information that is not shown by default. To remedy this, you can configure the output generated by "llstatus" on the command line. Here is an example configuration.

$ llstatus -f %n %mt %r %l %v %scs %sts
Name                     MaxT Run    LdAvg  FreeVMemory Schedd  Startd  
eagle01s.ccs.ornl.gov    4    4      4.02   5241480     Avail   Busy    
eagle02s.ccs.ornl.gov    4    4      4.00   5312692     Avail   Busy    
...
eagle99s.ccs.ornl.gov    4    0      0.01   5270680     Avail   Idle    
morgan.ccs.ornl.gov      0    0      0.00   -1          Avail   Down    

RS6000/AIX43              185 machines      28  jobs    480  running
Total Machines            185 machines      28  jobs    480  running

The Central Manager is defined on morgan.ccs.ornl.gov

All machines on the machine_list are present.

This example prunes out some of the default information and adds "MaxT" and "FreeVMemory". "MaxT" gives the maximum number of job instances (regardless of class) that may run on the given host at a time, and "FreeVMemory" gives the available swap space, in kilobytes. See "man llstatus" for more information on configuring output. You may want to create an "alias" for the "llstatus" configuration you prefer.
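
For example, under "ksh" you might put an alias like the following in your ".profile", using the same format string shown above (the alias name here is arbitrary):

alias lls='llstatus -f %n %mt %r %l %v %scs %sts'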


Job command files

To run a batch job under LoadLeveler, you first need to write a job command file. Here is an example file for a parallel MPI job.

#@ shell = /bin/ksh
#@ job_type = parallel
#@ output = $(host).$(jobid).out
#@ error = $(host).$(jobid).err
#@ wall_clock_limit = 30:00
#@ network.MPI = csss,not_shared,US
#@ tasks_per_node = 4
#@ node = 8
#@ node_usage = not_shared
#@ queue
pwd
echo $LOADL_PROCESSOR_LIST
export MP_SHARED_MEMORY=yes
poe a.out

The file has two components: LoadLeveler keyword statements and shell commands. The LoadLeveler keyword statements are preceded by "#@", making them appear as comments to a shell. The shell commands follow the "#@ queue" keyword statement and represent the executable content of the batch job.

Here is a description of each line. The script has no line specifying "class", so the default class, "batch", will be used.

#@ shell = /bin/ksh

Use the Korn shell, "ksh", to interpret the command file. By default, LoadLeveler interprets the command file using your login shell. The sample script is written in "ksh" syntax, so the explicit request of "ksh" allows it to work regardless of your login shell. If you prefer to use C-shell syntax, make the following changes to the sample command file.

Korn shell                      C shell
#@ shell = /bin/ksh             #@ shell = /bin/csh
export MP_SHARED_MEMORY=yes     setenv MP_SHARED_MEMORY yes

Warning for "csh" users: Make sure that your command file ends with a newline. If it does not, LoadLeveler will not execute the last command in your file. You can use the following command to check your file.
tail command_file
You need to add a newline (using "return" or "enter" in an editor) if the next command prompt appears on the same line as the last line of the file. Here is an example of such a case.
eagle163s% tail csh.ll
#@ error = $(host).$(jobid).err
#@ wall_clock_limit = 30:00
#@ network.MPI = csss,not_shared,US
#@ tasks_per_node = 4
#@ node = 8
#@ node_usage = not_shared
#@ queue
pwd
echo $LOADL_PROCESSOR_LIST
setenv MP_SHARED_MEMORY yes
poe a.outeagle163s% 
If a newline is not added to this file, the command "poe a.out" will not be executed when the job runs!
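
If you prefer to fix the file from the command line rather than in an editor, running "echo" with no arguments appends the missing newline; for example, for the file shown above:

echo >> csh.ll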

#@ job_type = parallel

Use multiple nodes for parallel commands. This keyword is required for parallel jobs. The keywords "tasks_per_node", "node", "network", etc. won't work without it.

#@ output = $(host).$(jobid).out

Send standard output to the file "$(host).$(jobid).out". "$(host)" is a LoadLeveler variable that holds the name of the host from which the job was submitted; it is not necessarily related to where the job runs. "$(jobid)" is a numeric ID that is unique among jobs submitted from a particular host, but not necessarily unique across LoadLeveler; two jobs submitted from different hosts can have the same "$(jobid)". The combination "$(host).$(jobid)" is unique, however. For example, two such jobs would both write to "131.out" if only the job ID were used, whereas "eagle164s.131.out" and "eagle163s.131.out" are distinct.

Unless you specify a full path, the output file is stored in the directory from which you submitted the job. If you don't specify the "output" keyword, the standard output is not saved.

#@ error = $(host).$(jobid).err

Send standard error output to the file "$(host).$(jobid).err". See the information above for the "output" keyword. You can send standard output and standard error to the same file.
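
For example, to collect standard output and standard error in a single file, point both keywords at the same name (the ".log" suffix here is just an illustrative choice):

#@ output = $(host).$(jobid).log
#@ error = $(host).$(jobid).log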

#@ wall_clock_limit = 30:00

Limit the job to 30 minutes of real time. If you do not specify a "wall_clock_limit", your job will get the default limit of two hours, regardless of class. For jobs longer than two hours, you must specify a longer limit. For shorter jobs, specifying a shorter time limit may allow the scheduler to fit your job in earlier.
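
For example, using the same hours:minutes:seconds notation seen in the "llclass -l" output above, an eight-hour job would specify:

#@ wall_clock_limit = 8:00:00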

#@ network.MPI = csss,not_shared,US

For MPI communication, use the SP switch with the User Space protocol in non-shared mode. This line requests that parallel MPI programs use the fastest form of internode communication available on the SP, User Space (US) protocol over the SP switch (device "csss"). Specifying non-shared mode guarantees exclusive use of the switch adapter on each assigned node for this job.

A separate "network" keyword is allowed for IBM's Low-Level Application Programming Interface, "network.LAPI", and for implementations of PVM not built on top of IBM's MPI, "network.PVM". LAPI may use the US protocol, but generic PVM can only use IP.

#@ tasks_per_node = 4

Use four tasks per node for parallel jobs. A task is equivalent to a process, and a single task may have multiple threads. This line specifies that four tasks, four MPI processes in this case, should be started on each node.

#@ node = 8

Allocate 8 nodes for parallel commands. Yes, the keyword is "node", not "nodes". See below to specify what kind of nodes you want according to processor count.

#@ node_usage = not_shared

Do not allow any other LoadLeveler jobs on the allocated nodes. This line guarantees that LoadLeveler will schedule no other jobs on the nodes assigned to this job for the duration of the job. It is otherwise possible for LoadLeveler to share nodes between multiple jobs. For parallel jobs, this is usually not desirable.

#@ queue

Queue the job! This keyword is critical. Without it, no job is created. Each "queue" keyword uses the environment specified by the keywords listed before it, so make sure to put it after the other relevant keywords.
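
A command file may contain more than one "queue" statement; each one creates a separate job step that inherits the keywords specified above it. The following is a minimal sketch of a two-step file, assuming the standard "step_name" keyword, the "LOADL_STEP_NAME" environment variable LoadLeveler sets for each step, and the usual behavior of running the script itself once per step when no "executable" keyword is given; the program names are hypothetical.

#@ shell = /bin/ksh
#@ job_type = parallel
#@ node = 8
#@ tasks_per_node = 4
#@ wall_clock_limit = 30:00
#@ step_name = prep
#@ queue
#@ step_name = compute
#@ queue
# this same script runs once per step; branch on the step name
case $LOADL_STEP_NAME in
    prep)    poe ./preprocess ;;   # hypothetical preprocessing program
    compute) poe ./a.out ;;
esac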


The remaining lines of the file specify the shell commands to be executed by the batch job. All sequential commands, such as the first three commands in this example, run on only the first node allocated to the job. Parallel commands start multiple processes spread across all allocated nodes.

pwd

Display the name of the current working directory. The job starts in the directory where the job was submitted. This behavior is different from some other batch systems, which always start jobs in the user's home directory.
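
If you want the job to run somewhere else, change directory explicitly at the top of the script; for example (the scratch path here is hypothetical):

cd /scratch/$LOGNAME/myrun || exit 1    # hypothetical path; quit if it is missing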

echo $LOADL_PROCESSOR_LIST

Display the nodes allocated to this job. LoadLeveler automatically sets the environment variable "LOADL_PROCESSOR_LIST" to a list of the nodes allocated for the given job, and printing this list in each job can help diagnose system problems. If you have more than 128 tasks, however, do not print this variable; LoadLeveler has trouble with the list at that size, and doing so may cause your job to fail.
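
If the same command file is used for jobs of varying size, you can guard the "echo" so the list is printed only when it is short; a minimal "ksh" sketch:

# count the entries in the node list and print it only if there are few of them
set -- $LOADL_PROCESSOR_LIST
if [ $# -le 128 ]; then
    echo "$@"
fi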

export MP_SHARED_MEMORY=yes

Use shared memory for MPI. IBM's MPI can now implement communication within a node using shared memory. This implementation greatly improves the bandwidth and latency of on-node communication without affecting communication between nodes. Because it uses extra memory, this implementation is not used by default, however. Setting the "MP_SHARED_MEMORY" environment variable to "yes" turns it on. The above line does this under "ksh". For "csh", use the following command instead.

setenv MP_SHARED_MEMORY yes

To take advantage of this shared-memory optimization, an MPI code must be compiled with the thread-safe version of the MPI library, i.e. using "mpxlf_r" or "mpcc_r".
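
For example, recompiling with the thread-safe wrappers (the source file names and optimization flags here are placeholders):

mpxlf_r -O3 -o a.out prog.f     # Fortran
mpcc_r  -O3 -o a.out prog.c     # C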

poe a.out

Run 32 copies of "a.out" across 8 nodes. If "a.out" is not a parallel program, this command will run 32 identical copies on 8 different nodes. If "a.out" is parallel (compiled with "mpxlf", "mpcc", etc.), it will run as a single 32-process application across 8 nodes. Specifying "poe" is optional for programs compiled to be parallel. POE options specified through LoadLeveler keyword commands ("node", "tasks_per_node", "network", etc.) override options on the "poe" command line.

A nice feature LoadLeveler provides is the ability to define job prolog and epilog scripts. If you have steps that should take place at the beginning or end of all your jobs, you can define a prolog and/or epilog script and code these steps once for all your jobs.

Before starting your job, LoadLeveler looks for an environment variable named $MY_CCS_LOADL_PROLOG. If it contains the path of an executable file, that file is run before starting your job script. If $MY_CCS_LOADL_PROLOG is not defined and $HOME/llprolog exists and is executable, it will be run.

Similarly, after your job completes, if $MY_CCS_LOADL_EPILOG is defined and contains the path of an executable file, that file will be run. If $MY_CCS_LOADL_EPILOG is not defined, but $HOME/llepilog exists and is executable, it will be run.

Generally, defining and exporting these environment variables in your .profile (or setting them with "setenv" in your .login if you use a csh variant) is sufficient to make them visible to LoadLeveler.
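
For example, a "ksh" user might add lines like the following to ".profile" (the script paths are hypothetical); each script is an ordinary executable shell script, so remember to "chmod +x" it.

export MY_CCS_LOADL_PROLOG=$HOME/bin/job_prolog
export MY_CCS_LOADL_EPILOG=$HOME/bin/job_epilog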


Submitting jobs

Use "llsubmit" to submit a job command file for batch execution.

$ llsubmit command_file
llsubmit: Processed command file through Submit Filter: "/opt/bin/llsubmitfilter".
llsubmit: The job "eagle163s.ccs.ornl.gov.12765" has been submitted.

The job shell will inherit the working directory from where you submitted the job. Also, unless you use full path names, the standard output and standard error files will be saved in this same directory.

If you forget to supply a "wall_clock_limit", your job will get the default limit, regardless of class.

$ llsubmit command_file
/opt/bin/llsubmitfilter: WARNING:  wall_clock_limit is set to "2:05:00, 2:00:00"
llsubmit: Processed command file through Submit Filter: "/opt/bin/llsubmitfilter".
llsubmit: The job "eagle163s.ccs.ornl.gov.12766" has been submitted.

Some classes have limits on the number of nodes a single job can request (though "batch" and "interactive" currently do not). Unfortunately, "llclass" does not reveal such limits. You may first discover the limit at submit time.

$ llsubmit command_file
llsubmit: Processed command file through Submit Filter: "/opt/bin/llsubmitfilter".
llsubmit: 2512-135 For the "node" keyword, maximum number of nodes requested is greater than allowed for this "class".
llsubmit: 2512-051 This job has not been submitted to LoadLeveler.

Unfortunately, "llsubmit" does not report what the limit actually is. See above to see how to list such limits.


Job status

Use "llq" to check the status of submitted jobs.

$ llq
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
eagle163s.12813.0        ernie      11/27 04:20 R  50  batch        eagle09s   
eagle163s.12816.0        ernie      11/27 04:50 R  50  batch        eagle172s  
eagle163s.12820.0        ernie      11/27 08:10 R  50  batch        eagle06s   
eagle163s.12814.0        grover     11/27 04:29 R  50  batch        eagle03s   
eagle163s.12815.0        grover     11/27 04:30 R  50  batch        eagle27s   
eagle101s.218.0          zoe        11/27 08:41 R  1   batch        eagle103s  
eagle163s.12846.0        bert       11/27 09:40 I  50  batch                   
eagle163s.12848.0        elmo       11/27 09:42 I  50  batch                   
eagle163s.12850.0        bert       11/27 09:46 I  50  batch                   
eagle163s.12851.0        bert       11/27 09:50 I  50  batch                   
eagle163s.12852.0        bert       11/27 09:52 I  50  batch                   
eagle163s.12853.0        bert       11/27 09:54 I  50  batch                   
eagle163s.12854.0        bert       11/27 09:56 I  50  batch                   
eagle163s.12856.0        bert       11/27 09:58 I  50  batch                   
eagle163s.12860.0        bert       11/27 10:03 I  50  batch                   
eagle163s.12861.0        herry      11/27 10:08 I  50  batch                   
eagle163s.12862.0        oscar      11/27 10:08 I  50  batch                   
eagle163s.12863.0        cookie     11/27 10:22 I  50  batch                   
eagle163s.12865.0        kermit     11/27 11:07 I  50  batch                   

19 job steps in queue, 13 waiting, 0 pending, 6 running, 0 held

The first column is the name of each job step, the second column is the owner of the job, and the third column is the time when the job was first submitted to LoadLeveler. The "ST" column gives the status of each job. Here are some common status values.

R Running
ST STarting
I Idle, waiting for resources
H Held by the user
S held by the System
RP Remove Pending, being removed

The "PRI" column gives the user priority of the job, though this priority is not currently used in making scheduling decisions. The "Class" column gives the class specified in the job command file ("batch" is the default). The final column, "Running On", gives the first node assigned to each running job. Only this first node appears, even for parallel jobs running on many nodes.

Some of the columns of default "llq" output are not particularly useful, and "llq" is capable of displaying useful information that is not shown by default. To remedy this, you can configure the output generated by "llq" on the command line. Here is an example configuration.

$ llq -f %o %id %nh %st %dd %dq
Owner       Step Id                  NM   ST Disp. Date  Queue Date  Running On              
----------- ------------------------ ---- -- ----------- ----------- ------------------------
kermit      eagle163s.12865.0        0    I              11/27 11:07                         
cookie      eagle163s.12863.0        0    I              11/27 10:22                         
oscar       eagle163s.12862.0        0    I              11/27 10:08                         
herry       eagle163s.12861.0        0    I              11/27 10:08                         
bert        eagle163s.12860.0        0    I              11/27 10:03                         
bert        eagle163s.12856.0        0    I              11/27 09:58                         
bert        eagle163s.12854.0        0    I              11/27 09:56                         
bert        eagle163s.12853.0        0    I              11/27 09:54                         
bert        eagle163s.12852.0        0    I              11/27 09:52                         
bert        eagle163s.12851.0        0    I              11/27 09:50                         
bert        eagle163s.12850.0        0    I              11/27 09:46                         
elmo        eagle163s.12848.0        0    I              11/27 09:42                         
bert        eagle163s.12846.0        0    I              11/27 09:40                         
ernie       eagle163s.12866.0        0    I              11/27 11:20                         
grover      eagle163s.12815.0        56   R  11/27 04:30 11/27 04:30 eagle27s                
ernie       eagle163s.12816.0        8    R  11/27 04:50 11/27 04:50 eagle172s               
zoe         eagle101s.218.0          32   R  11/27 08:41 11/27 08:41 eagle103s               
ernie       eagle163s.12813.0        8    R  11/27 04:20 11/27 04:20 eagle09s                
ernie       eagle163s.12820.0        16   R  11/27 08:10 11/27 08:10 eagle06s                
grover      eagle163s.12814.0        56   R  11/27 04:29 11/27 04:29 eagle03s                

In addition to the owner, job name, and status, this format gives "NM", the number of nodes used by the job, "Disp. Date", the time the job was started, and "Queue Date", the time the job was queued. See "man llq" for more information on configuring output. You may want to create an "alias" for the "llq" configuration you prefer.
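
For example, using the format string shown above:

alias llqf='llq -f %o %id %nh %st %dd %dq'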

As an alternative to "llq", we provide the local utility "llqn", which lists a different set of job characteristics. To list all the characteristics available from "llqn", use the "-a" option.

$ llqn -a
Job Id                         Owner    Class        SysPrio S Date             Node
------------------------------ -------- ------------ ------- - ---------------- ----
eagle163s.ccs.ornl.gov.12815.0 grover   batch        -4837086 R  Nov 28 04:30      56
eagle163s.ccs.ornl.gov.12816.0 ernie    batch        -4842014 R  Nov 28 04:50       8
eagle101s.ccs.ornl.gov.218.0   zoe      batch        -4842039 R  Nov 28 08:41      32
eagle163s.ccs.ornl.gov.12813.0 ernie    batch        -4740231 R  Nov 28 04:20       8
eagle163s.ccs.ornl.gov.12820.0 ernie    batch        -4718494 R  Nov 28 08:10      16
eagle163s.ccs.ornl.gov.12814.0 grover   batch        -4718498 R  Nov 27 12:29      56

eagle163s.ccs.ornl.gov.12865.0 kermit   batch        -4747045 I  Nov 27 11:07     144 
eagle163s.ccs.ornl.gov.12863.0 cookie   batch        -4749527 I  Nov 27 10:22      80
eagle163s.ccs.ornl.gov.12862.0 oscar    batch        -4836911 I  Nov 27 10:08      16
...

Unlike "PRI" with "llq", "SysPrio" is an accurate representation of the scheduling priority; the job with the largest (least negative) priority is scheduled next. Jobs with lower priority can skip ahead if they fit into holes in the scheduled job mix; this is called backfilling.

"Date" means different things for running and waiting ("I") jobs. For waiting jobs, "Date" is the queue time. For running jobs, "Date" is the latest time the job will finish, based on the start time and the wall-clock limit.

See "man llqn" for more details.

Why isn't my job running?

You can verify that your job is not running by checking the "ST" column of "llq" output. You can then use "llq -s" with the job name to find out why it isn't running. The output created by "llq -s" is long, so you may want to pick out the useful lines using "sed". The following example demonstrates how to display lines of "llq -s" output between the line "SUMMARY" and the line "ANALYSIS".

$ llq -s eagle163s.12865.0
...
(pages of information)
...

$ llq -s eagle163s.12865.0 | sed -n '/SUMMARY/,/ANALYSIS/p'
SUMMARY

This LoadLeveler cluster does not have sufficient resources at the present time
to run this job step.

ANALYSIS

The LoadLeveler cluster may not have sufficient resources for a variety of reasons. Nodes may be busy with other jobs, for example. Unfortunately, LoadLeveler cannot distinguish between a temporary reduction of resources and permanent system limitations. Therefore, if a job requests more nodes than the system has, the job will wait, and "llq -s" will return the message above, even though the job will never be able to run.

What nodes is my job using?

You can use "llq -l" to display detailed information about LoadLeveler jobs, including a list of the nodes allocated for each job. You can use "grep" to isolate this node list. If the job command file is written to use only SP nodes, you need only "grep" for "eagle"; otherwise, you also need to search for "bearcat", "bobcat", and "morgan".

$ llq -l eagle101s.218.0 | grep "gov::"
   Allocated Hosts : eagle103s.ccs.ornl.gov::csss(0,MPI,us),csss(1,MPI,us),csss(2,MPI,us),csss(3,MPI,us)
                   + eagle171s.ccs.ornl.gov::csss(0,MPI,us),csss(1,MPI,us),csss(2,MPI,us),csss(3,MPI,us)
                   + eagle22s.ccs.ornl.gov::csss(0,MPI,us),csss(1,MPI,us),csss(2,MPI,us),csss(3,MPI,us)
                   + eagle28s.ccs.ornl.gov::csss(0,MPI,us),csss(1,MPI,us),csss(2,MPI,us),csss(3,MPI,us)
                   + eagle36s.ccs.ornl.gov::csss(0,MPI,us),csss(1,MPI,us),csss(2,MPI,us),csss(3,MPI,us)
                   ...

Notice that each line of this example has 4 "csss" entries. This indicates that 4 MPI processes are running on each node.


Stopping jobs

You can use "llcancel" with a list of job names to cancel those jobs. The command removes waiting jobs and aborts running jobs.

$ llcancel eagle163s.12816.0
llcancel: Cancel command has been sent to the central manager.

You can also keep a job from running without removing it from LoadLeveler using "llhold" with a list of job names. You can then use "llhold -r" to release held jobs and allow them to run.

$ llhold eagle163s.12817.0 
llhold: Hold command has been sent to the central manager.
$ ...
...
$ llhold -r eagle163s.12817.0 
llhold: Hold command has been sent to the central manager.

The "llhold" command has no effect on running jobs.


Documentation

Eagle has "man" pages for each of the LoadLeveler commands. Full PDF documentation is also available from IBM's website. Note that we are currently using LoadLeveler Version 3 Release 1.

http://www-1.ibm.com/servers/eserver/pseries/library/sp_books/loadleveler.html

The document entitled Using and Administering is particularly useful.

For more information on "poe" options, see the "man" page or the online documentation for IBM's Parallel Environment (PE). We are currently at PE version 3 release 2.

http://www-1.ibm.com/servers/eserver/pseries/library/sp_books/pe.html
