Oak Ridge Leadership Computing Facility

Lens User Guide

1. Lens Overview
2. Requesting Access to OLCF Resources
2.1. Project Allocation Requests
2.2. User Account Requests
3. OLCF Help and Policies
3.1. User Assistance Center
3.2. Communications to Users
3.3. My OLCF Site
3.4. Special Requests and Policy Exemptions
3.5. OLCF Acknowledgement
4. Accessing OLCF Systems
4.1. OLCF System Hostnames
4.2. General-Purpose Systems
4.3. X11 Forwarding
4.4. RSA Key Fingerprints
4.5. Authenticating to OLCF Systems
5. Data Management
5.1. User-Centric Data Storage
5.1.1. User Home Directories (NFS)
5.1.2. User Work Directories (Lustre)
5.1.3. User Archive Directories (HPSS)
5.2. Project-Centric Data Storage
5.2.1. Project Home Directories (NFS)
5.2.2. Project Work Directories (Lustre)
5.2.3. Project Archive Directories (HPSS)
5.3. Transferring Data
5.4. Storage Policy Summary
6. Software and Shell Environments
6.1. Default Shell
6.2. Using Modules
6.3. Installed Software
7. Compiling on Lens
7.1. Controlling the Programming Environment on Commodity Clusters
8. Running Jobs on Lens
8.1. Login vs Compute Nodes on Commodity Clusters
8.2. Writing Batch Scripts for Commodity Clusters
8.3. Interactive Batch Jobs on Commodity Clusters
8.4. Common Batch Options to PBS
8.5. Batch Environment Variables
8.6. Modifying Batch Jobs
8.7. Monitoring Batch Jobs
8.8. Batch Queues on Lens
8.9. Job Execution on Commodity Clusters
8.9.1. Serial Job Execution on Commodity Clusters
8.9.2. Parallel Job Execution on Commodity Clusters
8.9.3. Resource Sharing on Commodity Clusters
8.9.4. Task-Core Affinity on Commodity Clusters
8.10. Job Accounting on Commodity Clusters
8.11. Lens Scheduling Policy

1. Lens Overview

Lens is a 77-node commodity-type Linux cluster. Its primary purpose is to provide a conduit for large-scale scientific discovery through analysis and visualization of simulation data generated on Titan. Users with accounts on Titan-supported projects are automatically given an account on Lens.

High-memory Nodes

The 45 high-memory nodes are each configured with four 2.3 GHz AMD Opteron processors and 128 GB of memory.

GPU Nodes

The remaining 32 GPU nodes are each configured with four 2.3 GHz AMD Opteron processors and 64 GB of memory. Each of these nodes also contains an NVIDIA 8800 GTX GPU with 768 MB of memory and an NVIDIA Tesla GPGPU with 4 GB of memory.

File Systems

The OLCF’s center-wide Lustre file system, named Spider, is available on Lens for computational work. With over 52,000 clients and 10 PB of disk space, it is the largest-scale Lustre file system in the world. A separate NFS-based file system provides $HOME storage areas, and an HPSS-based file system provides Lens users with archival space.

	INCITE	Director’s Discretion	ALCC
Allocations	Large	Small	Large
Call for Proposals	Once per year	At any time	Once per year
Closeout Report	Required	Required	Required
Duration	1 year	1 year	1 year
Job Priority	High	Medium	High
Quarterly Reports	Required	Required	Required

Email:	help@olcf.ornl.gov
Phone:	865-241-6536
Fax:	865-241-4011
Address:	1 Bethel Valley Road, Oak Ridge, TN 37831

System Name	Hostname	RSA fingerprint
Titan	`titan.ccs.ornl.gov`	`--`
Lens	`lens.ccs.ornl.gov`	`cc:6e:ef:84:7e:7c:dc:72:71:7b:76:7f:f3:46:57:2b`
Everest	`everest.ccs.ornl.gov`	`cc:6e:ef:84:7e:7c:dc:72:71:7b:76:7f:f3:46:57:2b`
Smoky	`smoky.ccs.ornl.gov`	`e3:88:b9:ba:fe:3a:fd:99:00:24:fc:e6:9d:5c:69:2b`
Sith	`sith.ccs.ornl.gov`	`28:63:5e:41:32:39:c2:ec:9b:63:e0:86:16:2f:e4:bd`
Data Transfer Nodes	`dtn.ccs.ornl.gov`	`50:dc:59:7b:e1:7c:ad:b2:30:55:9c:fa:fb:e8:6e:55`
Home (machine)	`home.ccs.ornl.gov`	`12:9b:10:f7:b9:c7:1b:a2:b0:52:5e:13:e2:b9:b2:8c`
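
One way to compare a host key against the table above is sketched below. It assumes a local OpenSSH recent enough to support `ssh-keygen -E md5` and to read a key from standard input (`-f -`). The helper names `normalize_fp` and `host_fp` are illustrative only and are not part of any OLCF tooling.

```shell
# Expected fingerprint for lens.ccs.ornl.gov, copied from the table above.
expected_fp="cc:6e:ef:84:7e:7c:dc:72:71:7b:76:7f:f3:46:57:2b"

# normalize_fp FP: strip an optional "MD5:" prefix and lowercase the hex digits
# (hypothetical helper, so fingerprints from different tools compare equal)
normalize_fp() {
    printf '%s\n' "$1" | sed 's/^MD5://' | tr 'A-F' 'a-f'
}

# host_fp HOST: print the MD5 fingerprint of HOST's RSA host key.
# Needs network access; assumes ssh-keyscan and ssh-keygen are installed.
host_fp() {
    ssh-keyscan -t rsa "$1" 2>/dev/null | ssh-keygen -l -E md5 -f - | awk '{ print $2 }'
}

# Usage (not run here because it contacts the host):
#   [ "$(normalize_fp "$(host_fp lens.ccs.ornl.gov)")" = "$expected_fp" ] \
#       && echo "fingerprint matches"
```

If the comparison fails, the safe course is to stop and contact the User Assistance Center rather than accept the key.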

Area	Nickname	Path	Type	Quota	Backups	Purge	Retention
User Home	–	/ccs/home/$USER	NFS	5 GB	Yes	Not purged	1 month after account deactivation
User Work	“Spider”	/tmp/work/$USER	Lustre	None	No	Files > 14 days old subject to deletion	Not retained
User Archive	“HPSS”	/home/$USER	HPSS	2 TB (or 2k files)	No	Not purged	3 months after account deactivation

Area	Nickname	Path	Type	Quota	Backups	Purge	Retention
Project Home	--	`/ccs/proj/[projectid]`	NFS	50 GB	Yes	Not purged	1 month after project deactivation
Project Work	"Spider"	`/tmp/proj/[projectid]`	Lustre	2 TB	No	Not purged	Not retained
Project Archive	"HPSS"	`/proj/[projectid]`	HPSS	100 TB (or 100k files)	No	Not purged	3 months after project deactivation

	GridFTP + GridCert	GridFTP + SSH	SFTP/SCP	BBCP
Data Security	insecure (default) / secure (w/configuration)	insecure (default) / secure (w/configuration)	secure	insecure (unsuited for sensitive projects)
Authentication	GridCert	Passcode	Passcode	Passcode
Transfer speed	Fast	Fast	Slow	Fast
Required Infrastructure	GridFTP server at remote site + user DOE GridCert	GridFTP server at remote site	Comes standard with SSH install	BBCP installed at remote site

Area	The general name of storage area/directory discussed in the storage policy.
Nickname	The branded name given to some storage areas or file systems.
Path	The path (symlink) to the storage area's directory.
Type	The underlying software technology supporting the storage area.
Quota	The limits placed on total number of bytes and/or files in the storage area.
Backups	States if the data is automatically duplicated for disaster recovery purposes.
Purged	States when data will be marked as eligible for permanent deletion.
Retention	States when data will be marked as eligible for permanent deletion after an account/project is deactivated.

	Area	Nickname	Path	Type	Quota	Backups	Purged	Retention
User	Home	--	`/ccs/home/$USER`	NFS	5 GB	Yes	Not purged	1 month
	Work	"Spider"	`/tmp/work/$USER`	Lustre	None	No	14 days	Not retained
	Archive	"HPSS"	`/home/$USER`	HPSS	2 TB or 2k files	No	Not purged	3 months
Project	Home	--	`/ccs/proj/[projid]`	NFS	50 GB	Yes	Not purged	1 month
	Work	"Spider"	`/tmp/proj/[projid]`	Lustre	2 TB	No	Not purged	1 month
	Archive	"HPSS"	`/proj/[projid]`	HPSS	100 TB or 100k files	No	Not purged	3 months
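
To see which files in a User Work area might fall under the 14-day purge, something like the following could be used. It is only a sketch: it assumes the purge is judged by modification time, which the table does not state, and `list_purge_candidates` is a hypothetical helper name.

```shell
# list_purge_candidates DIR [DAYS]
# Print regular files under DIR whose modification time is more than DAYS
# days in the past (default 14). Using mtime is an assumption; the policy
# table does not say which timestamp the purge examines.
list_purge_candidates() {
    dir=$1
    days=${2:-14}
    # find counts age in whole 24-hour periods, so "+14" means 15 days or more
    find "$dir" -type f -mtime +"$days" -print
}

# Typical use on Lens (path taken from the table above):
#   list_purge_candidates /tmp/work/$USER
```

Files reported this way are candidates for copying to a backed-up or archival area before they become eligible for deletion.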

Command	Description
`module list`	Lists modules currently loaded in a user’s environment
`module avail`	Lists all available modules on a system in condensed format
`module avail -l`	Lists all available modules on a system in long format
`module display`	Shows environment changes that will be made by loading a given module
`module load`	Loads a module
`module unload`	Unloads a module
`module help`	Shows help for a module
`module swap`	Swaps a currently loaded module for an unloaded module

`-I`	Start an interactive session
`-A abc123`	Charge to the `abc123` project
`-q qname`	Run in the `qname` queue
`-V`	Export the user's shell environment to the job's environment
`-l nodes=4:ppn=4`	Request 16 cores (4 nodes with 4 cores each)...
`-l walltime=00:30:00`	...for 30 minutes
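
Put together, the options above might form a command line like the one assembled below. This is a sketch: `abc123` and `qname` are the placeholders from the table, and the walltime is written as `00:30:00` on the assumption that PBS reads the value as HH:MM:SS.

```shell
nodes=4        # -l nodes=4
ppn=4          # cores per node, the ppn=4 part of the request
minutes=30     # desired session length in minutes

# Format the walltime as HH:MM:SS
walltime=$(printf '%02d:%02d:00' $((minutes / 60)) $((minutes % 60)))

# Total cores requested is nodes x ppn
total_cores=$((nodes * ppn))

# Assemble the interactive request; on a login node, run it instead of echoing
qsub_cmd="qsub -I -A abc123 -q qname -V -l nodes=${nodes}:ppn=${ppn} -l walltime=${walltime}"
echo "$qsub_cmd   # requests $total_cores cores"
```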

Option	Use	Description
`-A`	`#PBS -A <account>`	Causes the job time to be charged to `<account>`. The account string, e.g. `pjt000`, is typically composed of three letters followed by three digits and optionally followed by a subproject identifier. The utility `showproj` can be used to list your valid assigned project ID(s). This option is required by all jobs.
`-l`	`#PBS -l nodes=<value>`	Maximum number of compute nodes. Jobs cannot request partial nodes.
	`#PBS -l walltime=<time>`	Maximum wall-clock time. `<time>` is in the format HH:MM:SS.
	`#PBS -l gres=<filesystem>`	Associate batch job with one or more Lustre filesystems. Valid options are `widow1`, `widow2`, and `widow3`. Include multiple filesystems like `widow2%widow3`. Useful to omit associations in the event of file system outages.
`-o`	`#PBS -o <filename>`	Writes standard output to `<filename>` instead of `<job script>.o$PBS_JOBID`. `$PBS_JOBID` is an environment variable created by PBS that contains the PBS job identifier.
`-e`	`#PBS -e <filename>`	Writes standard error to `<filename>` instead of `<job script>.e$PBS_JOBID`.
`-j`	`#PBS -j {oe,eo}`	Combines standard output and standard error into the standard error file (`eo`) or the standard out file (`oe`).
`-m`	`#PBS -m a`	Sends email to the submitter when the job aborts.
	`#PBS -m b`	Sends email to the submitter when the job begins.
	`#PBS -m e`	Sends email to the submitter when the job ends.
`-M`	`#PBS -M <address>`	Specifies email address to use for `-m` options.
`-N`	`#PBS -N <name>`	Sets the job name to `<name>` instead of the name of the job script.
`-S`	`#PBS -S <shell>`	Sets the shell to interpret the job script.
`-q`	`#PBS -q <queue>`	Directs the job to the specified queue. This option is not required to run in the default queue on any given system.
`-V`	`#PBS -V`	Exports all environment variables from the submitting shell into the batch job shell.
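
As an illustration of how several of these options combine, the snippet below writes a small batch script to a file. Everything specific in it is an assumption made for the example: the project ID `pjt000` is the sample string from the table, while the job name, email address, file name `lens_job.pbs`, node counts, and the executable `./a.out` are placeholders.

```shell
# Write an example batch script; the quoted 'EOF' keeps $PBS_O_WORKDIR
# from being expanded until the job actually runs.
cat > lens_job.pbs <<'EOF'
#!/bin/bash
# Charge the job to a project (required for all jobs)
#PBS -A pjt000
# Job name, and merge standard error into the standard output file
#PBS -N test_run
#PBS -j oe
# Request 4 nodes with 4 cores each for at most one hour
#PBS -l nodes=4:ppn=4
#PBS -l walltime=01:00:00
# Send mail to this (placeholder) address when the job ends
#PBS -m e
#PBS -M user@example.com

# Jobs start in the home directory; return to the submission directory
cd $PBS_O_WORKDIR

# Launch 16 processes (placeholder executable)
mpirun -n 16 ./a.out
EOF

# Submit from a login node with: qsub lens_job.pbs
```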

Variable	Description
`$PBS_O_WORKDIR`	The directory from which the batch job was submitted. By default, a new job starts in your home directory. You can get back to the directory of job submission with `cd $PBS_O_WORKDIR`. Note that this is not necessarily the same directory in which the batch script resides.
`$PBS_JOBID`	The job’s full identifier. A common use for `PBS_JOBID` is to append the job’s ID to the standard output and error files.
`$PBS_NUM_NODES`	The number of nodes requested.
`$PBS_NUM_PPN`	The number of cores requested per node (the `ppn` value).
`$PBS_JOBNAME`	The job name supplied by the user.
`$PBS_NODEFILE`	The name of the file containing the list of nodes assigned to the job. Used sometimes on non-Cray clusters.
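
The fragment below sketches how these variables might be used near the top of a batch script. It assumes `$PBS_NUM_PPN` holds the per-node core count, as the `ppn` in its name and the `nodes=4:ppn=4` request syntax suggest, and it supplies fallback values so the same lines also run outside a batch job. The variable names `workdir`, `nodes`, `ppn`, and `total_cores` are illustrative.

```shell
# Fall back to placeholder values when the PBS variables are not set
workdir="${PBS_O_WORKDIR:-$PWD}"
nodes="${PBS_NUM_NODES:-1}"
ppn="${PBS_NUM_PPN:-1}"

# Total cores, assuming PBS_NUM_PPN is a per-node count
total_cores=$((nodes * ppn))

# Return to the directory the job was submitted from
cd "$workdir"

echo "job ${PBS_JOBID:-none} (${PBS_JOBNAME:-unnamed}): $nodes nodes x $ppn cores = $total_cores cores"

# List the distinct nodes assigned to the job, if a node file is available
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    sort -u "$PBS_NODEFILE"
fi
```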

Active	These jobs are currently running.
Eligible	These jobs are currently queued awaiting resources. A user is allowed two jobs in the eligible state. Eligible jobs are shown in the order in which the scheduler will consider them for allocation.
Blocked	These jobs are currently queued but are not eligible to run. Common reasons for jobs in this state are jobs on hold and the owning user currently having two jobs in the eligible state.

Queue Name	Queue Type	Max. Walltime	Available Node Types	Max. Running Jobs	Max. Total Jobs	Preemption Policy
`comp`	computation	06:00:00	All 77 nodes	1	2	Can be preempted by jobs in analysis queues
`comp_gpu`	computation	06:00:00	GPU nodes only	1	2	Can be preempted by jobs in analysis queues
`comp_mem`	computation	06:00:00	High-memory nodes only	1	2	Can be preempted by jobs in analysis queues
`vis`	analysis	24:00:00	All 77 nodes	N/A	N/A	Can preempt jobs in computation queues
`vis_gpu`	analysis	24:00:00	GPU nodes only	N/A	N/A	Can preempt jobs in computation queues
`vis_mem`	analysis	24:00:00	High-memory nodes only	N/A	N/A	Can preempt jobs in computation queues
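
A walltime request can be checked against these limits before submission with a few helper functions such as the ones sketched below. The function names are hypothetical, and the sketch assumes that the `_gpu` and `_mem` queues share the walltime limit of their parent `comp` or `vis` queue.

```shell
# queue_max_walltime QUEUE: print the walltime limit for a Lens queue
queue_max_walltime() {
    case $1 in
        comp|comp_gpu|comp_mem) echo "06:00:00" ;;
        vis|vis_gpu|vis_mem)    echo "24:00:00" ;;
        *) echo "unknown queue: $1" >&2; return 1 ;;
    esac
}

# walltime_seconds HH:MM:SS: convert a walltime string to seconds
walltime_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# fits_queue QUEUE HH:MM:SS: succeed if the request is within the queue limit
fits_queue() {
    limit=$(queue_max_walltime "$1") || return 1
    [ "$(walltime_seconds "$2")" -le "$(walltime_seconds "$limit")" ]
}

# Example: a 12-hour request fits an analysis queue but not a computation queue
#   fits_queue vis_gpu 12:00:00 && echo "ok for vis_gpu"
#   fits_queue comp 12:00:00 || echo "too long for comp"
```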

`-display-map`	Can be used to view layout
`-npernode`	Number of cores per node
`-n`	Number of total cores
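
Combining these options, a launch line for 16 processes placed four per node could be built as below. The flags are the ones listed above; `./a.out` is a placeholder executable, and the node and per-node counts are example values rather than recommendations.

```shell
nodes=4       # nodes allocated to the job (example value)
per_node=4    # value passed to -npernode (example value)

# The total passed to -n is nodes x per_node
total=$((nodes * per_node))

# -display-map prints the resulting layout before the program starts
launch="mpirun -display-map -npernode ${per_node} -n ${total} ./a.out"
echo "$launch"

# Inside a batch job on a compute node, run the command itself
# rather than echoing it.
```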

Project Type Details

After Project Approval

Steps to Obtain a User Account

Hours

Contact Us

After Hours

Ticket Submission Webform

OLCF Announcements Mailing Lists

OLCF “Notice” Mailing Lists

Weekly Update

System Status Pages

Mobile Apps

Twitter

Message of the Day

Activating a new SecurID® fob

Using a SecurID® fob

User Home Path

User Home Quotas

User Home Backups

User Home Permissions

Special User Website Directory

User Work Path

User Work Backup

User Work Purge

User Work Permissions

The /tmp directory

User Archive Path

User Archive Access

User Archive Accounting

Project Home Path

Project Home Quotas

Project Home Backups

Project Home Permissions

Project Work Path

Project Work Backup

Project Work Permissions

Project Archive Path

Project Archive Access

Project Archive Accounting

Data Transfer Nodes

Local Transfers

SPDCP

HSI and HTAR

Remote Transfers

GridFTP

SFTP and SCP

BBCP

Storage Policy Summary Table

Storage Policy Implementation

User Home / Project Home Quotas (NFS)

User Archive / Project Archive Quotas (HPSS)

Check Current Archive Usage

Modules Overview

Summary of Module Commands

Re-initializing the Module Command

Examples of Module Use

Available Compilers

Changing Compilers

Changing Versions of the Same Compiler

General Programming Environment Guidelines

Login Nodes

Compute Nodes

Components of a Batch Script

Interpreter Line

PBS Submission Options

Shell Commands

Example Batch Script

Interpreter Line

PBS Options

Shell Commands

Using to Debug

Choosing a Job Size

`qdel`

`qhold`

`qrls`

`qalter`

`showq`

`checkjob`

`qstat`

Using `mpirun`

On the Command Line via `showusage`