
pltar

Description

pltar is another tar utility. It preserves Lustre meta-data when creating an archive, restores that meta-data on extraction, and lists it via the -t -v options. Like spdcp, it can employ multiple clients to perform its functions. Depending on various factors, multiple clients may work on a single file in the archive, and multiple files in the archive may be processed in parallel. On the Cray XT hardware, the clients run on compute nodes. Unlike spdcp, pltar must always be run within the context of a batch job.

Use

pltar is intended as a replacement for the standard tar command, but it is not yet complete: this is a very early pre-release of the utility. It also provides additional parameters to assist in performance tuning.

For example, pltar uses a double-buffering scheme to boost I/O performance on parallel filesystems, and tuning the buffer size can improve performance further. Many things affect the performance of parallel I/O, and the size of the read request has a big effect on pltar in particular. Is it more efficient to read the file(s) in 48 MB chunks or in 192 MB chunks? From the meta-data server's viewpoint, the latter is more efficient because it reduces meta-data traffic. However, parallel filesystems cannot satisfy all requests uniformly, nor even all parts of a single request uniformly, so it can be more efficient to read the file in 48 MB chunks. Because the write request is (in effect) held up until the read request completes, using 48 MB reads and staging more requests can be faster than waiting on a single large request to complete.

The striping of the archive also affects archival performance, and parameters are provided to specify it when the archive is created.
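To make the chunk-size trade-off concrete, the following sketch counts how many read requests are needed to cover one file at each of the two chunk sizes discussed above. The 10 GB file size and the helper name requests_for are hypothetical and chosen only for illustration; neither comes from pltar itself, and the count says nothing about how long each request takes on a real filesystem.

```shell
#!/bin/sh
# Hypothetical file size for illustration only (not taken from pltar).
file_mb=10240   # a 10 GB file, expressed in MB

# Ceiling division: number of read requests needed to cover the file
# when it is read in chunks of the given size (in MB).
requests_for() {
    chunk_mb=$1
    echo $(( (file_mb + chunk_mb - 1) / chunk_mb ))
}

reads_48=$(requests_for 48)
reads_192=$(requests_for 192)

echo "48 MB chunks:  $reads_48 read requests"
echo "192 MB chunks: $reads_192 read requests"
```

For this file size the smaller chunk needs about four times as many requests (214 versus 54), which is the extra meta-data traffic mentioned above. In exchange, each request is shorter, so a slow portion of the filesystem delays less of the pipeline.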

The complicated part is still to come. tar and pltar can both create compressed (bzip2) archives, and compression complicates matters because the target archive has no predetermined mapping: members can be compressed in parallel sections, but the size of a compressed block of data is not known until its compression completes. Clients must therefore communicate with each other to write the compressed archive correctly. In this case, an even smaller buffer size may be better, such as 2 full stripe widths of the archive, i.e. 2 * stripe size * stripe count. (With a much smaller buffer, the overhead of the read/write requests can become excessive.) Parallel extraction from a compressed archive has not yet been implemented, but the data constraints will require similar communication.
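As a worked instance of the "2 full stripe widths" rule of thumb, the sketch below evaluates 2 * stripe size * stripe count using the stripe values that appear in the archive-creation example on this page. It assumes the stripe size is given in bytes, which this page does not state explicitly, and it does not claim which pltar option, if any, should receive the result.

```shell
#!/bin/sh
# Stripe settings borrowed from the archive-creation example on this page.
# Assumption: the stripe size is expressed in bytes.
stripe_size=1048576
stripe_count=9

# Two full stripe widths of the archive: 2 * stripe size * stripe count.
buffer_bytes=$(( 2 * stripe_size * stripe_count ))
buffer_mb=$(( buffer_bytes / 1048576 ))

echo "Suggested buffer: $buffer_bytes bytes ($buffer_mb MB)"
```

Under that assumption, the rule gives 18874368 bytes, or 18 MB, which is well below the 48 MB chunk size discussed earlier.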

To use pltar, use the following invocation template within your batch job or interactive batch session:

module load pltar
pltar [options] Source

To be more specific, the following example is provided:

module load pltar
pltar -d 48 -x -f mytarball.tar
pltar --file-stripe-size=1048576 --file-stripe-count=9 -d 96 \
      -c -f mynewtarball.tar {List of directories}

More information on pltar is available via its help function:

module load pltar
pltar --help

Note that the module load command must also be run within a batch job or an interactive batch session, because it defines the parallel environment for pltar.

Available Versions

System   Application/Version   Build
Smoky    pltar/1.0.0           ompi1.4.2_pgi10.4.0