Data Management Policy
Categories: Data Management, Policies
- Principal Investigators (Non-Profit)
- Principal Investigators (Industry)
- All Users
Title: Data Management Policy
Version: 12.10
User Storage Areas
Users are provided with several storage areas, each of which serves a different purpose. These areas are intended for storage of data for a particular user and not for storage of project data.
There are three subcategories of user storage areas: User Home, User Work, and User Archive.
User Home
Home directories for each user are NFS-mounted on OLCF systems. This is online disk space intended for long-term, frequently used storage, and is backed up on a daily basis. This file system does not generally provide the input/output (I/O) performance required by most jobs. A User Home directory has a 5GB quota.
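As an illustration of staying within the 5GB quota, the following minimal Python sketch totals the size of files under a directory and compares it to the limit. It is not an OLCF tool. The function names are hypothetical, and it assumes that "5GB" means 5 * 1024**3 bytes and that usage is the sum of file sizes; the file system's own accounting may differ.

```python
import os

# Assumption: "5GB" is read here as 5 * 1024**3 bytes. The policy does not
# say whether the quota is counted in decimal or binary units.
HOME_QUOTA_BYTES = 5 * 1024**3


def directory_usage_bytes(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat counts a symlink as the link itself, not its target.
                total += os.lstat(path).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total


def quota_fraction(used_bytes, quota_bytes=HOME_QUOTA_BYTES):
    """Return used space as a fraction of the quota (1.0 means full)."""
    return used_bytes / quota_bytes
```

For example, `quota_fraction(directory_usage_bytes(os.path.expanduser("~")))` gives a rough estimate of how full a home directory is under these assumptions.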
User Work
Individual User Work directories reside in the center-wide high-capacity Lustre file system on large, fast disk areas intended for global (parallel) access to temporary/scratch storage. User Work directories are provided across each system. Because of the scratch nature of the file system, it is not backed up and files are automatically purged on a regular basis. Files should not be retained in this file system and should be migrated to archival space as soon as the files are not actively being used.
If a file system associated with User Work storage runs out of free space or gets low on free space, then data on that file system will be subject to involuntary deletion without prior notice.
User Archive
The High Performance Storage System (HPSS) is the archival storage system at the OLCF. Space on HPSS is intended for files that are not immediately needed.
Each user archive space belongs to a single user. An individual user has a 2TB space quota and a 2,000 file quota. User archive areas in HPSS are created in /home/[username].
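Because the user archive quota limits both total bytes and file count, it can help to check a set of local files against both limits before transferring them. The sketch below is illustrative only. The function name is hypothetical, "2TB" is assumed to mean 2 * 1024**4 bytes, and the current usage figures must be supplied by the caller, since this policy does not describe how to query them.

```python
import os

# Limits quoted in the policy for a user archive area.
# Assumption: "2TB" is read as 2 * 1024**4 bytes.
USER_ARCHIVE_MAX_BYTES = 2 * 1024**4
USER_ARCHIVE_MAX_FILES = 2000


def fits_in_archive(current_bytes, current_files, new_paths,
                    max_bytes=USER_ARCHIVE_MAX_BYTES,
                    max_files=USER_ARCHIVE_MAX_FILES):
    """Return True if adding new_paths keeps both quotas satisfied.

    current_bytes and current_files describe what is already archived;
    the caller has to provide them.
    """
    added_bytes = sum(os.path.getsize(p) for p in new_paths)
    total_bytes = current_bytes + added_bytes
    total_files = current_files + len(new_paths)
    return total_bytes <= max_bytes and total_files <= max_files
```

With a 2,000 file limit, the file count may be reached well before the space limit when archiving many small files, so both values are checked together.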
User archive data stored on HPSS will only be retained up to three months past user account termination. There is no lifetime retention. Users are expected to migrate their data off HPSS before their account is deactivated.
Project Storage Areas
Projects are provided with several storage areas for the data they need. Project directories provide members of a project with a common place to store code, data files, documentation, and other files related to their project. While this information could be stored in one or more user directories, storing in a project directory provides a common location to gather all files.
There are three subcategories of project storage areas: Project Home, Project Work, and Project Archive.
Project Home
Project home directories are NFS-mounted on OLCF systems. This is online disk space intended for long-term, frequently used storage and is backed up on a daily basis. This file system does not generally provide the input/output (I/O) performance required by most jobs. The standard project directory has a 50GB quota.
Project Work
Project work directories reside in the center-wide high-capacity Lustre file system on large, fast disk areas intended for global (parallel) access. This is online disk space intended for long-term, frequently used storage, but this area is not backed up. An individual project directory has a 2TB quota. Project spaces on the center-wide high-capacity Lustre directories may be denied or revoked at any time.
Project Archive
The High Performance Storage System (HPSS) is the archival storage system at the OLCF. Space on HPSS is intended for files that are not immediately needed.
Project archive areas are shared between all users of the project. Each project archive area has a 100TB space quota and a 100,000 file quota. Project areas in HPSS are created in /proj/[projectid].
Larger quotas may be granted for a project on a case-by-case basis; the projects must include storage needs in their proposals along with strategies for data migration after the project has ended. Even though HPSS is a very large storage system, space is not unlimited. Users must not store files unrelated to their OLCF projects on HPSS. They must also periodically review their files and remove unneeded ones. Quotas will be enforced.
Projects should not duplicate executables or data sets in each user’s area, but instead should set privileges to share a master copy. Contact the User Assistance Center for help in setting privileges to facilitate sharing.
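One way to share a master copy is to grant group access to a directory tree rather than copying it into each user's area. The following Python sketch is an assumption-laden illustration, not OLCF guidance: it assumes project members belong to a common Unix group that already owns the files, and the function name is hypothetical. The User Assistance Center remains the authority on how privileges should be set.

```python
import os
import stat


def make_group_readable(root):
    """Add group read access under root without removing any existing bits.

    Directories also get group execute so members can traverse them, and
    files that are executable by the owner become executable by the group.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        dir_mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, dir_mode | stat.S_IRGRP | stat.S_IXGRP)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone
            mode = stat.S_IMODE(os.stat(path).st_mode)
            new_mode = mode | stat.S_IRGRP
            if mode & stat.S_IXUSR:
                new_mode |= stat.S_IXGRP
            os.chmod(path, new_mode)
```

The sketch only adds permissions; it never grants write access or access to users outside the group.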
Project data stored on HPSS will only be retained up to three months past the end of a project allocation. There is no lifetime retention. Projects are expected to migrate their data off HPSS before project deactivation.
Local Scratch Storage
A large, fast disk area intended for parallel access to temporary storage in the form of scratch directories may be provided on a limited number of systems. This area is local to the specific system. This directory is, for example, intended to hold output generated by a user’s job. Because of the scratch nature of the file system, it is not backed up and files are automatically purged on a regular basis. Files should not be retained in this file system and should be migrated to archival storage as soon as the files are not actively being used.
If a file system runs out of space or gets low on space, data on that file system will be subject to involuntary deletion without prior notice. Quotas may be instituted on a machine-by-machine basis if deemed necessary.
Purge and Retention Policies
There are three purge/retention policy types: the Scratch Purge Policy, the User Purge Policy, and the Project Purge Policy. Please note there is no lifetime retention for any data on OLCF machines. Projects and users are expected to migrate data off OLCF systems when they are finished with those systems. When a project ends, active users in that project will have limited timeframes to migrate all data off OLCF file systems.
Scratch Purge Policy
Scratch purge policies are intended to maintain file systems so there is always a large amount of scratch space available for executing jobs. Files are automatically purged on a regular basis. Any file older than 14 days is subject to deletion. If a file system runs out of free space or gets low on free space, data on that file system will be subject to involuntary deletion without prior notice.
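To see which files already fall under the 14-day rule, a user could scan a scratch directory for old files. The Python sketch below is a minimal illustration with a hypothetical function name. It assumes file age is measured from the last modification time, which this policy does not specify; the actual purge may use a different timestamp.

```python
import os
import time

PURGE_AGE_DAYS = 14  # from the Scratch Purge Policy
SECONDS_PER_DAY = 86400


def files_at_risk(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Yield (path, age_in_days) for files under root older than max_age_days.

    Assumption: age is measured from the last modification time (st_mtime).
    """
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # the file vanished or is unreadable
            age_days = (now - mtime) / SECONDS_PER_DAY
            if age_days > max_age_days:
                yield path, age_days
```

Files reported by such a scan are candidates for migration to archival storage. Files younger than 14 days can still be deleted without notice if the file system runs low on space.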
User Purge Policy
User purge policies are intended to reclaim space after a user account is disabled for any reason. When a user account is deactivated prior to the user migrating needed data off of OLCF storage areas, the user must request a temporary data access account extension, and then will have no more than 1 month to remove all user data from OLCF global and local storage systems, and no more than 3 months to remove all user data from OLCF archival storage systems.
In cases where project data is stored in a User Home area, a project representative must submit a request for data from the user home area within 1 month of the user account being disabled.
In cases where a deadline for an extension is missed, OLCF reserves the right to delete all files and directories designated as user data across all applicable storage spaces.
Project Purge Policy
Project purge policies are intended to reclaim space after a project ends. When a project ends, project members will have no more than 1 month to remove all data from OLCF global and local storage systems, and no more than 3 months to remove all data from OLCF archival storage systems. If a project needs additional time to remove project data from OLCF systems, a project member must request an extension.
In cases where a project misses the deadline for an extension, OLCF reserves the right to delete all files and directories designated as project data across all applicable storage spaces.
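The 1-month and 3-month limits in the user and project purge policies can be turned into concrete dates. The sketch below is illustrative only. The function names are hypothetical, a "month" is assumed to be a calendar month with the day clamped to the end of shorter months, and the start date is whatever date the clock begins on, which the caller must supply.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after start, clamping the day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def removal_deadlines(start_date):
    """Return the latest removal dates implied by the 1-month and 3-month limits."""
    return {
        "global_and_local": add_months(start_date, 1),
        "archival": add_months(start_date, 3),
    }
```

For a start date of 31 January 2024, this yields 29 February 2024 for global and local storage and 30 April 2024 for archival storage. These dates are upper bounds under the stated assumptions; the policy wording is "no more than" 1 month and 3 months.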
Summary
Column | Description |
---|---|
Area | The general name of storage area/directory discussed in the storage policy. |
Nickname | The branded name given to some storage areas or file systems. |
Path | The path (symlink) to the storage area’s directory. |
Type | The underlying software technology supporting the storage area. |
Quota | The limits placed on total number of bytes and/or files in the storage area. |
Backups | States if the data is automatically duplicated for disaster recovery purposes. |
Purged | States when data will be marked as eligible for permanent deletion. |
Retention | States when data will be marked as eligible for permanent deletion after an account/project is deactivated. |

Area | Nickname | Path | Type | Quota | Backups | Purged | Retention |
---|---|---|---|---|---|---|---|
User Home | – | /ccs/home/$USER | NFS | 5GB | Yes | Not purged | 1 month |
User Work | “Spider” | /tmp/work/$USER | Lustre | None | No | 14 days | Not retained |
User Archive | “HPSS” | /home/$USER | HPSS | 2TB or 2k files | No | Not purged | 3 months |
Project Home | – | /ccs/proj/[projid] | NFS | 50GB | Yes | Not purged | 1 month |
Project Work | “Spider” | /tmp/proj/[projid] | Lustre | 2TB | No | Not purged | 1 month |
Project Archive | “HPSS” | /proj/[projid] | HPSS | 100TB or 100k files | No | Not purged | 3 months |
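For scripting purposes, the summary can also be expressed as data. The Python sketch below transcribes the values listed above into records and shows one query. The class and function names are hypothetical and this is not an official machine-readable form of the policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageArea:
    area: str
    path: str
    fs_type: str
    quota: str
    backed_up: bool
    purged: str
    retention: str


# Values transcribed from the summary table.
AREAS = [
    StorageArea("User Home", "/ccs/home/$USER", "NFS", "5GB", True, "Not purged", "1 month"),
    StorageArea("User Work", "/tmp/work/$USER", "Lustre", "None", False, "14 days", "Not retained"),
    StorageArea("User Archive", "/home/$USER", "HPSS", "2TB or 2k files", False, "Not purged", "3 months"),
    StorageArea("Project Home", "/ccs/proj/[projid]", "NFS", "50GB", True, "Not purged", "1 month"),
    StorageArea("Project Work", "/tmp/proj/[projid]", "Lustre", "2TB", False, "Not purged", "1 month"),
    StorageArea("Project Archive", "/proj/[projid]", "HPSS", "100TB or 100k files", False, "Not purged", "3 months"),
]


def not_backed_up(areas=AREAS):
    """Return the names of areas whose data is not automatically backed up."""
    return [a.area for a in areas if not a.backed_up]
```

Here `not_backed_up()` returns the four Work and Archive areas, which is a reminder that only the two Home areas are backed up.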
Consequences of Abuse
Storage usage will be monitored continually. When time permits, offenders will be warned to clean up their space. Ignoring these warnings will result in loss of access privileges.