Scientific Supercomputing at the NIH

Disk Storage on the Helix Systems

Disk Storage        /home/...                /scratch/... & /scratch/export                  /data/...
Purpose             home directory           temporary files                                 large files
Accessible from     all systems              all systems except Biowulf computational nodes  all systems
Helixdrive address  \\helixdrive\[user]      \\helixdrive\scratch                            \\helixdrive\data
Quotas              yes                      no (this space is shared by all users)          yes
Creation            with Helix account       created by user                                 with Biowulf account
Backups             daily                    none                                            daily
Snapshots           hourly, nightly, weekly  nightly                                         nightly, weekly
Time Limits         none                     deleted after 14 days                           none
Charged             yes                      no                                              no

Home, data and scratch directories actually reside on disks attached to specialized file servers. They are made accessible to the other systems via the Network File System (NFS).

To determine how much disk space you are using, use the checkquota command:

[user@helix ~]$ checkquota
Mount      Used       Quota      Percent   Files
/data7:    796.7 MB   48.0 GB    1.62%     2838
/home:     330.0 MB   1.0 GB     33.00%    3563
mailbox:   64.5 MB    200.0 MB   32.23%
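
As a rough illustration, the Percent column can be filtered to show only the mounts that are filling up. The sketch below assumes that checkquota prints rows laid out exactly as in the sample output above (mount, used amount and unit, quota amount and unit, percent). The function name over_quota and the 80% default are hypothetical, not part of the Helix tools.

```shell
# Sketch: print mounts whose usage exceeds a threshold (default 80%).
# Assumes rows of the form "<mount>: <used> <unit> <quota> <unit> <percent>% [files]",
# as in the sample checkquota output; over_quota is a hypothetical name.
over_quota() {
    awk -v limit="${1:-80}" '
        $6 ~ /%$/ {                       # only data rows have a percent in field 6
            pct = $6
            sub(/%$/, "", pct)            # strip the trailing percent sign
            if (pct + 0 > limit) print $1, $6
        }'
}
```

It would be used as checkquota | over_quota 30, which for the sample output lists /home and mailbox.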

Users who require higher quotas should send email to staff@helix.nih.gov to make their request. The email should include the quota requested, a brief justification of the need for the increased quota, and an estimate of how long the space will be needed (if appropriate).

Home Directories (/home/...)

A home directory with the pathname /home/username is established for each new user when the Helix account is created. The initial quota is 1 GB. Files in home directories may be considered permanent in the sense that they remain on the system as long as the user wishes. Snapshots are made of home directories every three hours. Users who are over quota when they log in receive a warning message at that time.

Charge

Disk storage in home directories is charged at a rate of $0.0033 per megabyte per day.
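
To make the rate concrete, the charge is simply megabytes stored times $0.0033 times the number of days. The following is a minimal sketch of that arithmetic. The function name home_charge and its argument order are hypothetical, and it assumes usage stays constant over the period.

```shell
# Sketch: estimate the home-directory charge at $0.0033 per megabyte per day.
# home_charge is a hypothetical helper; usage is assumed constant over the period.
home_charge() {
    # $1 = megabytes stored, $2 = number of days (default 1)
    awk -v mb="$1" -v days="${2:-1}" 'BEGIN { printf "%.2f\n", mb * 0.0033 * days }'
}

home_charge 330 30    # 330 MB held for 30 days; prints 32.67
```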

Temporary Storage (/scratch/...)

Use the /scratch directory, rather than /tmp or /usr/tmp, for temporary storage of relatively large amounts of data. We recommend that each user create a directory in /scratch and place files there. The /scratch directory is accessible from Helix and the Biowulf head node. Although /tmp and /usr/tmp are the traditional temporary directories on UNIX systems, users are discouraged from using them on the NIH Helix Systems because they are often used by system utilities, compilers, and application programs. Files in /tmp and /usr/tmp are subject to removal at any time.
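
One possible way to set up a personal working directory is sketched below. Naming the directory after the login name and restricting its permissions are assumptions made for illustration, not site requirements, and make_scratch_dir is a hypothetical helper.

```shell
# Sketch: create a private working directory under the scratch area.
# The <root>/<username> layout and mode 700 are illustrative assumptions.
make_scratch_dir() {
    root="${1:-/scratch}"                 # scratch root; defaults to /scratch
    dir="$root/$(id -un)"                 # directory named after the current user
    mkdir -p "$dir" && chmod 700 "$dir" && printf '%s\n' "$dir"
}
```

Calling make_scratch_dir with no argument creates the directory under /scratch and prints its path.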

Files in /scratch that have not been accessed for 14 days are automatically deleted. The usefulness of /scratch depends on users' cooperation. It is inappropriate to regularly 'touch' files in /scratch as a means of extending the 14-day time limit. Users who require more permanent disk space should request it.
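
To see which files are approaching the time limit, so that anything still needed can be copied to permanent storage, a listing along the following lines may help. This is a sketch only: stale_files is a hypothetical name, the 10-day default is arbitrary, and it assumes the cleanup is based on the last access time that find reports.

```shell
# Sketch: list files under a directory not accessed for at least N days (default 10).
# find's -atime +K matches files last accessed more than K whole days ago,
# so K = N - 1 selects files idle for N days or longer.
stale_files() {
    dir="$1"
    days="${2:-10}"
    find "$dir" -type f -atime +"$((days - 1))" -print
}
```

For example, stale_files /scratch/username 10 lists files in that directory tree that have been idle for ten days or more.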

Files on /scratch are not backed up to tape, but snapshots are taken nightly. Disk storage on /scratch is not charged.

Network Disk Storage (/scratch/export)

/scratch/export, a subdirectory of /scratch, has the same characteristics as /scratch, but in addition is exported as part of the Network File System (NFS). This space is accessible from workstations that can act as NFS clients. Data in this directory should not be considered secure, since it is accessible from the network. The /scratch/export filesystem is shared among all SGI systems that are part of the Helix Systems. To mount /scratch/export using NFS, use the following mount command:

mount helixscratch.nih.gov:/vol/scratch/export /mnt

where /mnt is a directory present on your workstation.

Data Directories (/data/...)

Users who require very large datasets (typically Biowulf cluster users) have additional storage in the directory /data/username. This area is accessible from Helix, the Biowulf head node and the Biowulf computational nodes.

Mapped Network Drives and Desktop File Access

/home, /data, and /scratch can be mounted on desktop machines via the Samba server Helixdrive. This allows simple drag-and-drop access to files. See the Helixdrive documentation for more information.

Using Disk Space Efficiently

Use the gzip program to reduce the amount of disk space used by infrequently accessed files. Depending on the file, this can significantly reduce space requirements. For example, the disk space required for the text file below was reduced by more than 50%:

helix% ls -l
-rw------- 1 quux 925594 Apr 20 12:48 TDATA
helix% gzip TDATA
helix% ls -l
-rw------- 1 quux 334198 Apr 20 12:50 TDATA.gz

Note that the gzip command renames compressed files to have an extension of .gz. The gunzip command returns a compressed file to its uncompressed form. The zcat command writes the uncompressed contents of a file to standard output while leaving the compressed file on disk unchanged, for example:

zcat huge-postscript-file.gz | lpr

See the gzip man page for more information.
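
Compressing infrequently accessed files can also be done in bulk. The sketch below combines find and gzip to compress every regular file in a directory tree that has not been accessed for a given number of days. The function name compress_idle and the 30-day default are illustrative assumptions, and it is worth trying on a small test directory first.

```shell
# Sketch: gzip regular files not accessed for at least N days (default 30),
# skipping files that already end in .gz. compress_idle is a hypothetical name.
compress_idle() {
    dir="$1"
    days="${2:-30}"
    # -atime +K matches files last accessed more than K whole days ago
    find "$dir" -type f ! -name '*.gz' -atime +"$((days - 1))" -exec gzip {} +
}
```

For example, compress_idle /data/username/archive 60 compresses files in that tree that have been idle for 60 days or more; the path shown is a placeholder.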