STORAGE

The NAS uses three mass storage systems, Lou1, Lou2, and Lou3, to provide long-term data storage for users of our high-end computing systems. Each user can log into any of the Lou systems but has storage space on only one of them. Which system you should store data on is determined by the "domain" you compute in.

• If you launch jobs from cfe1 and your nobackup filesystem is /nobackup1a-h, store data on Lou1.
• If you launch jobs from cfe2 and your nobackup filesystem is /nobackup2a-g, store data on Lou2.
• If you launch jobs from cfe3 and your nobackup filesystem is /nobackup3a-d, store data on Lou3.

Your home directory can be referenced as louX:/u/your_userid, where X is 1, 2, or 3.
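
For example, copying a single file from cfe1 to a Lou1 home directory might look like the following (results.dat is a hypothetical file name; substitute your own user ID):

    cfe1 % scp results.dat lou1:/u/your_userid/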

The storage systems are SGI Altix systems running the Linux operating system. The disk space for the three systems combined is about 240 terabytes (TB), which is split into filesystems ranging from 9-30TB in size.

Data stored on disk is migrated to tape whenever necessary to make space for more data. Two copies of your data are written to tape media in silos located in separate buildings.

Lou1-3 have a combined ten 9840 and twenty T10000 Sun/STK tape drives. Each 9840 tape cartridge holds 20 gigabytes (GB) of data, while each T10000 cartridge holds 500 GB. The total storage capacity is up to 12 petabytes.

Data migration (from disk to tape) and de-migration are managed by the SGI Data Migration Facility (DMF) and Tape Management Facility (TMF).

Policies and usage guidelines for storing and transferring your files are provided below:



Table of Contents:

+ Transferring Files from Computing Systems to Mass Storage

+ Validating Transferred Files using md5sum

+ Quota Policy on Disk Space and Files

+ Using DMF Commands

+ Portable File Names and Sizes

+ Slow File Retrieval

+ Optimizing File Retrieval

+ Maximum Amount of Data to Retrieve Online

+ Maximum File Size Policy

+ Collecting Small Files into Single Large Files

+ GNU tar Examples

+ Tar create collection

+ Tar extraction

+ Tar list

+ CPIO Examples

+ CPIO create (write)

+ CPIO extraction (read)

+ CPIO list

+ Resend: File copy Retry on Remote System Outages or Interrupts

+ Using GPG to encrypt your data


Transferring Files from Computing Systems to Mass Storage

Your Columbia CXFS nobackup filesystem (e.g., /nobackup1a-h, /nobackup2a-g) is mounted on the mass storage system (Lou1-3) that you are assigned to. As a result, you can log into LouX (where X is 1, 2, or 3) and copy files from nobackup to your home directory on Lou. SGI provides a command called cxfscp, a tuned version of cp, which can copy files at sustained rates of up to 400 MB/s.
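
As a rough sketch (the nobackup path shown is only an illustration of where your scratch directory might live), a cxfscp copy from the mounted nobackup filesystem into your Lou home directory might look like:

    lou1 % cxfscp /nobackup1a/your_userid/bigfile.dat /u/your_userid/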

If you need to initiate the transfer from Columbia, we recommend bbscp when the file to be transferred does not need to be encrypted in transit. If the data must be encrypted, even within the HEC enclave, use scp instead.

Transferring files from Pleiades or RTJones to Mass Storage can be done with bbscp/bbftp or scp. Disk-to-disk copying to Mass Storage may be implemented in the near future.
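
As a hedged illustration, assuming bbscp accepts scp-style source and destination arguments (it is intended as a faster alternative to scp for unencrypted transfers), and using pfe1 as a placeholder Pleiades front-end prompt:

    pfe1 % bbscp model_output.dat louX:/u/your_userid/
    pfe1 % scp model_output.dat louX:/u/your_userid/     # use scp instead if the data must be encrypted in transit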


Validating Transferred Files using md5sum

It is good practice to check that files have been copied correctly to mass storage. A good way to do this is to use sum or md5sum to create checksums of the files at the source, perform the copy, then recompute the checksums at the destination and compare them with the originals.

For example, suppose you wanted to copy a subdirectory called project1 from cfe1 to lou. You could follow these steps:

    1. On cfe1, find all files in the directory hierarchy of project1, compute the checksums of each file, and write out the checksums to the file project1.sums
      cfe1 % find project1 -type f -print0 | xargs -0 md5sum > project1.sums
      

      Note: ' -type f ' restricts the search to regular files; ' -print0 ' terminates each filename with a null character so that names containing spaces or other special characters are passed safely to ' xargs -0 '.

    2. On cfe1, tar up the directory and copy the tar file to your home directory on lou1, along with the checksums
      cfe1 % tar -czf project1.tgz project1 
      cfe1 % scp project1.tgz lou1:
      cfe1 % scp project1.sums lou1:
      

      Note: ' tar -czf ' creates a new archive (-c), compresses it with gzip (-z), and writes it to the named file (-f).

    3. On lou1, assuming there is room in /tmp, create a temporary location and untar the archive from your home directory there (the extracted copies in /tmp are only for verification and will be removed afterwards). Use md5sum to check each extracted file against the checksums recorded in project1.sums.
      lou1 % mkdir -p /tmp/your_username
      lou1 % cd /tmp/your_username
      lou1 % tar -xzf ~/project1.tgz
      lou1 % md5sum --check ~/project1.sums
      project1/test/README: OK
      project1/test/src/test.f: OK
      ...
      
    4. If everything reports OK, clean out the /tmp directory
      lou1 % rm -rf /tmp/your_username/project1
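
If you only need to confirm that the archive file itself arrived intact, rather than re-checking its extracted contents, a quicker alternative is to compare the checksum of the tar file on both ends and verify that the two sums match exactly:

    cfe1 % md5sum project1.tgz
    lou1 % md5sum ~/project1.tgz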
      


    Quota Policy on Disk Space and Files

    Some NAS filesystems enforce quotas. Two kinds of quotas are supported: limits on the total disk space occupied by the user's files, and limits on how many files the user can store, irrespective of size. For quota purposes, directories count as files.

    Further, there are two different limits: hard limits and soft limits. Hard limits cannot be exceeded, ever. Any attempt to use more than your hard limit will be refused with an error. Soft limits, on the other hand, can be exceeded temporarily. You can stay over your soft limit for a certain period of time (the grace period). If you remain over your soft limit for more than the grace period, the soft limit is enforced as a hard limit.

    You will not be able to add or extend files until you get back under the soft limit. Usually, this means deleting unneeded files or copying important files elsewhere (perhaps the lou archival storage system) and then removing them locally.

    When you exceed your soft limit you will begin getting daily emails reminding you how long until the grace period expires. These are intended to be informative and not a demand to immediately remove files.
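
    A minimal way to check your current usage against these limits, assuming the standard Linux quota utility is available on the NAS front ends and on Lou (if it is not, or the output is unclear, ask the help desk):

    cfe1 % quota -s
    lou1 % quota -s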

    Disk Space Quotas on Columbia /nobackup file systems

    You should have a scratch directory in /nobackup1a-h, /nobackup2a-g, or /nobackup3a-d. This filesystem is part of a Storage Area Network (SAN) and can be seen on every compute host in your domain, the front end, and on your mass storage server.

    There are also local nobackup filesystems, /nobackup1-24, each of which can only be seen by the compute host it is attached to.

    As the names suggest, these filesystems are not backed up, so any files that are removed cannot be restored. Essential data should be stored on Lou1-3 or on other, more permanent storage.

    • There is a 200 GB soft limit and a 400 GB hard limit on disk space in each /nobackup filesystem that you have access to. If you exceed the soft quota, an email will be sent to inform you of your current disk usage and how much of your grace period remains. It is expected that a user will exceed their soft limit as needed; however, after 14 days, users who are still over their soft limit will have their batch queue access to Columbia disabled.
    • If an account has been disabled for more than 14 days, then its Columbia data will be moved to the archive host, lou, and kept there for 6 months before removal, unless the project lead requests to have the data moved to another account.
    • If an account no longer has batch access to a node, then all data from that node should be moved off within 7 days (or sooner if the other project needs the space).
    • If an account needs larger quota limits, send an email justification to support@nas.nasa.gov. This will be reviewed by the HECC Deputy Project Manager, Bill Thigpen, for approval.

    Disk File Quotas on lou

    • There is no quota on file space on Lou1, Lou2, or Lou3 because the data is written to tape. There is, however, a quota on the number of files you can have: currently a soft limit of 250,000 files and a hard limit of 300,000 files (a quick way to check your current file count is sketched after this list).
    • There is a 14-day grace period if the soft limit is exceeded. An email will be sent to inform you of your current usage and how much of your grace period remains. It is expected that a user will exceed their soft limit as needed; however, after 14 days, users who are still over their soft limit will be unable to archive files until they have reduced their usage to below the soft limit.
    • If an account needs larger quota limits, send an email justification to support@nas.nasa.gov. This will be reviewed by the HECC Deputy Project Manager, Bill Thigpen, for approval.
    • The maximum size of a file moved to lou should not exceed 30% of the size of your home filesystem on Lou. If you need to move files larger than this, please contact the help desk (support@nas.nasa.gov) for assistance.
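
    To see roughly how many files and directories currently count against the file-count limits, one simple (though possibly slow) check from your Lou home directory is:

    lou1 % find ~ | wc -l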


    Portable File Names and Sizes

    Portable File Names

    Use portable file names. A name is portable if it contains only ASCII letters and digits, `.', `_', and `-'. Do not use spaces or wildcard characters; do not start a name with `-' or `//', and do not include `/-'. Avoid deep directory nesting.

    If you intend for tar archives to be read under MS-DOS, do not rely on case distinctions in file names, and consider using the GNU doschk program to diagnose names that are illegal under MS-DOS, whose naming rules are even more restrictive than those of Unix-like operating systems.

    Portable File Sizes

    Even though lou's archive filesystems allow file sizes greater than several hundred gigabytes, not all operating systems or filesystems can manage this much data in a single file. If you plan to transfer files to an older Mac or PC desktop, verify the maximum file size it supports; a single file will likely need to be smaller than 4 GB to transfer successfully.
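
    If you do need to move a large archive to a system with such a limit, one workaround (sketched here with a hypothetical archive name, using the standard split and cat utilities) is to break the file into pieces below the limit and reassemble them on the destination:

    lou.user1%  split -b 2000m big_archive.tar big_archive.tar.part_
    (transfer the pieces, then on the destination system:)
    desktop%  cat big_archive.tar.part_* > big_archive.tar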


    Slow File Retrieval

    Commands on Lou that should finish quickly sometimes end up taking a long time.

    When you do an " ls " on Lou, you see all the files you have stored there as though they were on disk. However, most of them have actually been migrated to tape by SGI's Data Migration Facility (DMF).

    One problem with DMF is that it does not deal well with retrieving one file at a time from a long list of files. If you do an " scp " with a list of files, Unix feeds those files to DMF one at a time. This means that the tapes containing the files are constantly loaded and unloaded, which is hard on the tapes and tape drives, and also very slow. As the list of files gets longer (through use of "*" or moving a "tree" of files), the problem grows to the point where it can take hours to transfer a set of files that would take only a few minutes if they were on disk. When several people do file transfers at once that retrieve files one at a time, it can tie the system in knots.

    Optimizing File Retrieval

    DMF lets you fetch files to disk as a group with the " dmget " command. The tape is read once and all the requested files are retrieved in a single pass. Essentially, give dmget the same list of files you are about to transfer, and when dmget completes, scp/ftp/cp the files as you had originally intended. Alternatively, you can put dmget in the background and run your transfer while it is working. If any files are already on disk, dmget sees this and does not try to get them from tape.

    There is also a dmfind command that lets you walk a file tree to find offline files to give to dmget . Make very sure you are in the correct directory before running dmfind . Use the " pwd " command to determine your current directory.

    To make sure you do not bring too much data back online at once, first check how much is involved, using du with the --apparent-size option or using /usr/local/bin/dmfdu .

    Note that dmfdu will give an error message for each symbolic link that points at a nonexistent file.

        lou# /usr/local/bin/dmfdu Foo
        Foo
                 13 MB regular            340 files
               1114 MB dual-state        1920 files
              74633 MB offline           2833 files
                 13 MB small              340 files
              75761 MB total             5093 files
    
    

    When transferring data between lou and Columbia nodes, use the /nobackup filesystems instead of the (slow) Columbia NFS home directories.

    File transfer rates vary with the load on the system and how many users are transferring files at the same time. For files larger than 100 megabytes, scp transfers between Lou and Columbia nodes on the /nobackup filesystems typically run at 7-17 MB/s over the gigabit network interface, while scp transfers between Columbia nodes typically run at 20-30 MB/s.

    Example 1:

    lou.user1%  dmget *.data &
    
    lou.user1%  scp -qp *.data  myhost.jpl.nasa.gov:/home/user/wherever
    

    Example 2:

    lou.user1%  dmfind  /u/user1/FY2000 -state OFL -print | dmget &
    
    lou.user1%  scp -rqp /u/user1/FY2000 some_host:/nobackup/user1/wherever
    
    

    You can see the state of a file by doing " dmls -l" instead of " ls -l". For more information on using DMF, please look at:

    http://www.nas.nasa.gov/Users/Documentation/DMF-Commands.html

    Maximum Amount of Data to Retrieve Online

    The online disk space for Lou1-3 is much less than their tape storage capacity, and it is impossible to retrieve all files to online storage at the same time. So, before retrieving a large amount of data, you should check that there is enough online space for it. The df command shows the amount of free space in a filesystem. The lou script dmfdu reports how much total (online and offline) data is in a directory. To use this script, simply " cd " into the directory whose total data usage you want to know and execute the script.

    If you would like to know the total amount of data under your home directory, first find out whether your account is under s1a-s1e, s2a-s2e, or s3a-s3e. Assuming you are under s1b, you can then use dmfdu /s1b/your_userid to find the total amount. An alternative is to simply cd to your home directory and use " dmfdu * ", which will show usage for each file or directory.
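
    Putting those two alternatives into commands (using the s1b example above):

    lou.user1%  dmfdu /s1b/your_userid
    lou.user1%  cd ~
    lou.user1%  dmfdu *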

    Lou1-3's archive filesystems are between 8 TB and 30 TB in size, but the available space typically floats between 10% and 30%. In example 3, 29% of the space is unused. It is best to retrieve at most 10% of the filesystem space at a time: do what you need to with those files (scp, edit, compile, etc.), release the space ( dmput -r ), then retrieve the next group of files, use them, release the space, and so on. In example 3, one directory's data is retrieved from tape and copied to the remote host, and its data blocks are released before more data is retrieved from tape.

    If this process is not followed, it is very likely that the filesystem will become full, and retrievals from tape and file transfers to remote hosts will fail for everyone trying to use the same filesystem.

    Example 3:

    lou.user1%  df -lh .
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/lxvm/lsi_s1b     8.6T  6.1T  2.6T  71% /s1b
    
    
    
    lou.user1%  dmfdu project1 project2
    project1
              2 MB regular            214 files
             13 MB dual-state           1 files
         229603 MB offline            101 files
              2 MB small              214 files
         229606 MB total              315 files
    
    project2
              7 MB regular            245 files
           4661 MB dual-state          32 files
         218999 MB offline             59 files
              7 MB small              245 files
         223668 MB total              336 files
    
    
    lou.user1%  cd  project1
    
    lou.user1%  dmfind  .  -state  OFL  -print   |  dmget  &
    
    lou.user1%  scp  -rp  /u/user1/project1  remote_host:/nobackup/user1
    
    (Verify that the data has successfully transferred)
    
    lou.user1%  dmfind  .  -state  DUL  -print   |  dmput  -rw
    
    lou.user1%  df  -lh  .
    
    
    lou.user1%  cd  ../project2
    
    lou.user1%  dmfind  .  -state  OFL  -print   |  dmget  &
    
    lou.user1%  scp  -rpq  /u/user1/project2  remote_host:/nobackup/user1
    
    lou.user1%  dmfind  .  -state  DUL  -print   |  dmput  -rw
    


    Maximum File Size Policy

    Lou's archive filesystems are between 8 TB and 30 TB in size, but small files (currently, those under 1 MB) consume up to 500 GB of disk space. Small files normally remain online at all times, reducing the total amount of disk cache by 0.5 TB.

    An excessively large file (greater than 20% of your Lou home filesystem size) can cause the system to thrash. This is especially true of a tar or cpio file.

    If you have very large files to create or transfer to lou, call or e-mail the help desk (support@nas.nasa.gov) so the staff can work with you to avoid causing problems for yourself and other users.


    Collecting Small Files into Single Large Files

    The DMF archival system is optimized for the storage of large files (within the limit mentioned above). The tar and cpio programs allow you to collect multiple files and directory trees into a single archive file for storage on lou. You can then extract any or all files from this collection as needed. For example, you might create a tar file of all the sources for a program on a particular date and save that tar file on lou. At a later date, you could retrieve the file and extract the sources to rebuild the program as it existed on that date. As another example, a program might run through multiple timesteps, producing an image file at each step. Rather than store the individual images on lou, you can combine them, plus other files related to the run, into a single tar file.

    Note: If you will need individual files from a collection on a frequent basis (e.g., daily/weekly), it is probably better to store the files separately. Before you can extract a specific file from a collection, the entire tar archive has to be retrieved into online disk storage.

    If you are going to create a tar/cpio archive from a directory, you cannot determine how much data is in the directory with a plain du command, since data that has been migrated to tape may not be counted.
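
    To estimate the true size of a directory before archiving it, use the options mentioned in the Optimizing File Retrieval section above; for example, with the project1 directory:

    lou.user1%  du --apparent-size -sh project1
    lou.user1%  /usr/local/bin/dmfdu project1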

    GNU tar examples

    There are multiple versions of tar installed on lou and the columbia systems. For these examples, you want to use the version in /usr/local/bin. (As of this writing, it is GNU tar 1.15.1.) To make sure you get this version, you can move /usr/local/bin to the front of your PATH:

    host.user% set path=(/usr/local/bin $path ) # For csh or tcsh
     
    host.user% PATH=/usr/local/bin:$PATH #For sh, bash, etc.
    

    Running tar on lou

    You can run tar on lou as well as on the Columbia systems, but when creating tar files you need to use dmget first to retrieve all the files that will be included in the archive (see Optimizing File Retrieval). In the last part of example 7, you might not want to transfer the whole archive file to columbia when all you need is the PBS output file. In that case, log in to lou and proceed as in example 7, but omit the scp step. Again, be sure to set your PATH to use the tar in /usr/local/bin.
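
    For instance, before creating an archive of a directory called sources on lou (a hypothetical name), you could bring all of its offline files back to disk first, using the same dmfind/dmget pipeline shown earlier:

    lou.user1%  dmfind sources -state OFL -print | dmget
    lou.user1%  tar -cf src_20050929.tar sources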

    Tar Create Collection

    Collecting sources for a program into a single tar archive. Assume you have a directory on columbia called "xyz/sources" that contains all the sources and other files needed to compile program xyz. You want to create a tar archive of that directory, create a table of contents, and store both on lou.

    Example 4:

    columbia.user% cd xyz
    
    columbia.user% set date=`date +%Y%m%d`
    
    columbia.user% set tarfile=src_$date.tar
    
    # Create the tar file
    
    columbia.user% tar -cf $tarfile sources
    
    # Create a Table of Contents
    
    columbia.user%  tar  -tf  $tarfile  > $tarfile.TOC
    
      #Verify the tar file size matches the directory and the Table of Contents has
      #the same number of lines as there are files in the directory.
    
    columbia.user%  ls -lh $tarfile ; du -sh sources
    -rw-r--r-- 1 user1 group1 2.1M Sep 29 14:51 src_20050929.tar
    2.1M    sources
    
    columbia.user%  find sources | wc -l ; wc -l $tarfile.TOC
    112
    112 src_20050929.tar.TOC
    
    columbia.user%  scp  $tarfile  lou:somewhere/$tarfile
    
    columbia.user%  rm  $tarfile      #  Don't  need  columbia  copy  any  more
    
    

    The "date" and "tarfile" shell variables are used to give the tar file a name based on the output from the date command. By convention, tar files end with the suffix ".tar" If you ran these commands on September 29th, 2005, the file would have the name "src_20050929.tar".

    The tar command is run with the -c and -f path options. -c says to create a tar archive. -f gives the path to the archive to create. You'll always want to use the -f option, because the default archive location is usually a physical device, such as a tape drive or floppy disk.
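
    If you also want the archive compressed, add the -z option (as in the earlier checksum example) and, by convention, use a .tgz or .tar.gz suffix:

    columbia.user% tar -czf src_$date.tgz sources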

    Tar Extraction

    Now, let's say at some later time you wanted to get this version of the sources back.

    Example 5:

    columbia.user% mkdir xyztmp
    
    columbia.user% cd xyztmp
    
    columbia.user% scp lou:somewhere/src_20050929.tar .
    
    columbia.user% tar -xf src_20050929.tar
    
    columbia.user% cd sources
    
    columbia.user% make ...
    
    Here, the -x option says to extract items from the archive. Again, -f path is used to specify the tar file.

    Note: When we created the tar file, we used a relative path to name the directory we wanted to archive (we used just "sources," rather than the full path "/u/user/xyz/sources"). This way, we can extract the files into a different location, just by starting the extract from a different directory ("xyztmp" in the example). If we were in the original xyz directory, the extract would have replaced the current sources subdirectory with the old one, which is usually not what you want.

    Example 6:

    Collecting all files related to a particular job into a tar archive. For this example, assume you have a program that, among other output, creates several JPEG image files, at various timesteps during its run. You start each run from a different directory, to keep all the files from each run separate from other runs. However, the program also creates large checkpoint files (with suffix ".chk") that you do not want to include in the archive.

    columbia.user% cd pgsimruns
    
    columbia.user% mkdir run52
    
    columbia.user% cd run52
    
    # Copy/create PBS script and program input files
    
    columbia.user% qsub ... run.sh
    
    [Wait for job to complete.]
    
    columbia.user% set files=`ls -1 | egrep -v '\.chk|\.tar' `
    
    columbia.user% tar -cf run52.tar $files
    
    columbia.user% scp run52.tar lou:somewhere/run52.tar
    

    We use ls and egrep to build a list of files in the directory, omitting the checkpoint files. We also omit any .tar files. It is a common mistake to include the tar file itself in the list of files to be collected.

    Note: The argument to ls is -1 (one), not -l (ell), to list one file per line.

    Tar list

    Later, say you want to go back and review how the job ran. You want to examine the PBS job output file. By now, though, you've forgotten the job name. So, you first use tar to produce a table of contents of the archive, then extract just the job output file.

    Example 7:

    columbia.user% scp lou:somewhere/run52.tar /tmp/run52.tar
    
    columbia.user% tar -tvf /tmp/run52.tar
    
    -rwxr-xr-x user/group  35 2005-09-30 13:53:18 run.sh
    
    -rw-r--r-- user/group 20193 2005-09-30 14:10:13 pgsim.in
    
    -rw-r--r-- user/group 409600 2005-09-30 14:12:46 step0.jpg
    
    -rw-r--r-- user/group 409600 2005-09-30 14:12:52 step10.jpg
    
    -rw-r--r-- user/group 409600 2005-09-30 14:12:58 step20.jpg
    
    ...
    
    -rw-r--r-- user/group 409600 2005-09-30 14:14:17 step150.jpg
    
    -rw-r--r-- user/group   5738 2005-09-30 14:14:23 pgsim.out
    
    -rw------- user/group   5372 2005-09-30 14:14:26 run.sh.o8880
    
    -rw------- user/group      0 2005-09-30 14:14:26 run.sh.e8880
    
    columbia.user% tar -xf /tmp/run52.tar run.sh.o8880
    

    The first tar uses the -t option to indicate that you want a table of contents, and -v to get a verbose listing.


    CPIO Examples

    cpio is another tool, like tar, for collecting sets of files into a single archive file. The newer GNU versions of cpio and tar can each read both (cpio and tar) formats. In the past, cpio was the primary portable archive file format.

    CPIO create

    cpio's major advantage over tar is that the files to be archived are specified on stdin rather than on the command line. A significant difference from tar is that when tar is asked to archive a directory, it archives the directory's contents as well; cpio must be told explicitly about each item in a directory to archive.

    Example 8:

    Let's take example 4 above and use cpio instead of tar to archive a directory of program source files and create a Table of Contents.

    columbia.user%  cd  xyz
    
    columbia.user%  set  date=`date  +%Y%m%d`
    
    columbia.user%  set  cpiofile=src_$date.cpio
    
    columbia.user%  find  sources  -print  |  cpio  -o  -c  >  $cpiofile
    
    columbia.user%  cat $cpiofile  |  cpio  -it   >  $cpiofile.TOC
    
      #Verify the cpio file size matches the directory and the Table of Contents
      #has the same number of lines as there are files in the directory.
    
    columbia.user%  ls -lh $cpiofile ; du -sh sources
    -rw-r--r-- 1 user1 group1 2.1M Sep 29 14:51 src_20050929.cpio
    2.1M    sources
    
    columbia.user%  find sources | wc -l ; wc -l $cpiofile.TOC
    112
    112 src_20050929.cpio.TOC
    
    
    columbia.user%  scp  $cpiofile  lou:somewhere/$cpiofile
    
    columbia.user%  rm  $cpiofile
    
    

    We use the find command to generate the list of all files in the sources directory. The -o option to cpio says to output an archive, and the -c option says to create it in a more compatible (portable) header format.

    CPIO extraction

    The equivalent steps to restore the contents of the cpio archive are:

    Example 9:

    columbia.user% mkdir xyztmp
    
    columbia.user% cd xyztmp
    
    columbia.user% scp lou:somewhere/src_20050929.cpio .
    
    columbia.user% cpio -ic < src_20050929.cpio
    

    The -i option says to input the archive (usually from stdin, as in this example). As before, -c says the archive is in the more compatible format.

    Converting example 6 to use cpio is similar. The goal is to archive most, but not all, of the files in a directory. Note that we do not need example 6's $files variable, because we can pipe the ls/egrep output directly into cpio.

    columbia.user% [ Same as example 6 through waiting for the job to complete ]
    
    columbia.user% ls -1 | egrep -v '\.chk|\.cpio' | cpio -oc > run52.cpio
    
    columbia.user% scp run52.cpio lou:somewhere/run52.cpio
    

    Note: As in example 6, the argument to ls is -1 (one), not -l (ell).

    Cpio is given the -o and -c arguments, to output the archive in compatible format.

    CPIO list

    The cpio method for listing the table of contents of an archive and extracting a single file is:

    Example 10:

    columbia.user% scp lou:somewhere/run52.cpio /tmp/run52.cpio
    
    columbia.user% cpio -ictv < /tmp/run52.cpio
    
    -rwxr-xr-x   1 user1 grp1         35 Sep 30 13:53 run.sh
    
    -rw-r--r--   1 user1 grp1      20193 Sep 30 14:10 pgsim.in
    
    -rw-r--r--   1 user1 grp1     409600 Sep 30 14:12 step0.jpg
    
    -rw-r--r--   1 user1 grp1     409600 Sep 30 14:12 step10.jpg
    
    -rw-r--r--   1 user1 grp1     409600 Sep 30 14:12 step20.jpg
    
    
    -rw-r--r--   1 user1 grp1     409600 Sep 30 14:14 step150.jpg
    
    -rw-r--r--   1 user1 grp1       5738 Sep 30 14:14 pgsim.out
    
    -rw-------   1 user1 grp1       5372 Sep 30 14:14 run.sh.o8880
    
    -rw-------   1 user1 grp1          0 Sep 30 14:14 run.sh.e8880
    
    25767 blocks
    
    columbia.user% cpio -icm run.sh.o8880 < /tmp/run52.cpio
    

    In the first cpio, the -i and -t arguments, combined, say we want to list a table of contents. The -c option, again, enables compatibility mode, although cpio can usually automatically detect which format was used in creating an archive. Verbose output is selected with -v.

    In the second cpio, the -m flag says to restore the modification times of extracted files to their values when they were archived. Otherwise, the modification times would be the time when cpio performs the extract.

    Note: When extracting files from an archive, cpio will not overwrite existing files by the same name. Add the -u option if you want to replace such files.

    A nice feature of cpio is that the list of files to extract is really a list of file name patterns. So, in the last example, because we know the format of PBS output files, we could skip listing the table of contents and let cpio find the right file for us:

    columbia.user% cpio -icm 'run.sh.o*' < /tmp/run52.cpio
    


    Resend: File Copy Retry on Remote System Outages or Interrupts

    Using scp to transfer many files or more than 50 GB of data from a Columbia node to Lou can easily take over an hour. After starting the transfer and seeing that it is working, you might walk away for a while; if a problem occurs several minutes into the transfer (Lou crashes, a network problem, a full filesystem, etc.), that file transfer fails and the remaining files will likely not be transferred either. Normally it takes less than 20 minutes to reboot the host, and other problems are typically resolved in less than an hour.

    If possible, use tar or cpio to gather the files into a single file to transfer to lou. If this is not possible, use the resend tool.

    A tool called resend has been developed to help with transferring many files or large amounts of data to Lou. It is located in /usr/local/bin on Lou and on all Columbia nodes.

    resend sends files and directories to a remote node; if a file fails to transfer, it retries that file up to -r times, waiting -t minutes after the first failure and doubling the wait time after each additional failure.

    The default values are -r 10 retries and -t 3 minutes, with a maximum wait time between retries (-M) of 60 minutes.

    Use resend interactively or from a small batch job (4p) to make effective use of the project CPU allocation. The tool is simply dropped in front of an scp command line.
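
    If the defaults are not suitable, the retry count and initial wait time can presumably be adjusted on the command line. The flag placement shown here is an assumption, so confirm the exact syntax with the tool's usage message or the help desk:

    columbia.nas.nasa.gov % resend -r 5 -t 10 scp junktar lou:.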

    Example 11:

    columbia.nas.nasa.gov % ls -l
    
    total 20889816
    
    drwx--x---+   2 user1   grp1               14 Aug 19 14:53 acl
    
    drwx------    2 user1   grp1             4096 Sep 26 13:31 junktar
    
     
    
    columbia.nas.nasa.gov % resend scp junktar lou:.
    
    Success: mkdir -p junktar
    
    Success: scp -q  junktar/f2m.1 lou:.
    
    Success: scp -q  junktar/f2m.2 lou:.
    
    Success: scp -q  junktar/f2m.3 lou:.
    
    Success: scp -q  junktar/f2m.4 lou:.
    
    Success: scp -q  junktar/f2m.5 lou:.
    
    Success: scp -q  junktar/f2m.6 lou:.
    
    Success: scp -q  junktar/f2m.7 lou:.
    
    scp: junktar/f2m.8: Permission denied
    
    Failed: Resending:  scp -q  junktar/f2m.8 lou:.
    
    { resend automatically waits three minutes and tries again }
    
    Success: scp -q  junktar/f2m.8 lou:.
    
    Success: scp -q  junktar/f2m.9 lou:.
    
     
    
    + ... { rest of the files }
    





Last Updated: May 5, 2009