NOTICE:
This Legacy journal article was published in Volume 3, May 1993, and has not been
updated since publication. Please use the search facility above to find regularly-updated information about
this topic elsewhere on the HEASARC site.
The HEASARC's Newly Consolidated
Anonymous FTP Account
Steve Drake and Bruce O'Neel (HEASARC)
The HEASARC has a new DecSystem 5000 computer that will act as its main data
server. Its Internet node name is legacy.gsfc.nasa.gov, corresponding to the
(present) IP address 128.183.8.233; since the IP address may change at some
point in the future, it is always better to use the node name, whenever
possible. legacy is destined to become the HEASARC's main interface with
the user community. The HEASARC On-line Service is currently being ported to
this ultrix machine. In July 1993, legacy will replace NDADSA as the
gateway to the HEASARC data archive. NDADS will, from that point on, only be
used as a data archive facility. We have already established an anonymous ftp
account on legacy and this article discusses how to use this facility,
the data that are presently available, and the data that will be available in
the near future. It should be noted that all anonymous ftp accounts intended
for public access on other OGIP[1] machines
such as rosserv will be phased out over the course of the next year, and their
resident data files moved over to legacy. For the interim period we
will have some duplication, in that data may be available on both
legacy and other computers.
Notice that legacy may also eventually become a DECNET node so the files
in its anonymous ftp account can be copied by users on other DECNET nodes in
the standard (DECNET) way.
What is an anonymous ftp account?
The File Transfer Protocol (ftp) is a utility allowing the rapid transfer of
information in the form of files between a source computer (in our case,
legacy) and the user's home computer. An anonymous ftp account is one
that anyone can access by logging in as anonymous (or as
ftp). (The user should remember that unix and ultrix machines are
case-sensitive.) The user is then prompted to give his or her e-mail address
(e.g., jones@node.dept.inst.edu) as a `password', and then the user is
permitted into the account. In general, both the source and the user's
computers have to be on the INTERNET. Further information on ftp can be found
in the useful summary of public software given by E. D. Feigelson and F.
Murtagh in PASP, vol. 104, p. 574 (1992).
The legacy anonymous ftp account
The particular type of ftp server installed on legacy is a friendly or
verbose one that automatically types out welcome headers when you
initially log on and when you cd into a sub-directory for the first time. If
you experience difficulties when you access the anonymous ftp account on
legacy (e.g., your session gets frozen), it might be because your
particular ftp client has trouble communicating with such a verbose ftp server
as legacy. If you suspect this is happening, you can suppress all the
automatic welcome screens generated by the latter simply by placing a minus
sign (-) in front of your password when you log in. Another useful feature of
the legacy ftp server is that it automatically logs all commands, so
that statistics of the usage of our system and the relative demand for the
various data and software products can be readily compiled.
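For readers who would rather script the log-in than type it, the convention described above (anonymous as the username, an e-mail address as the password, and an optional leading minus sign to silence the welcome screens) can be sketched in Python with the standard ftplib module. This is an illustration only: the function names and the ftp_factory parameter are hypothetical, and the sketch assumes that the server still accepts anonymous ftp connections at the time of reading.

```python
from ftplib import FTP


def anonymous_password(email, quiet=False):
    """Build the 'password' for an anonymous log-in.

    A leading minus sign asks the verbose server described in this
    article to suppress its automatic welcome screens.
    """
    return "-" + email if quiet else email


def open_anonymous_session(host, email, quiet=False, ftp_factory=FTP):
    """Connect to host and log in as 'anonymous'.

    ftp_factory is a hypothetical hook (not part of ftplib) that lets
    the function be exercised without a network connection.
    """
    ftp = ftp_factory(host)  # ftplib.FTP(host) connects immediately
    ftp.login("anonymous", anonymous_password(email, quiet))
    return ftp


# Typical use (commented out because it needs a live server):
# ftp = open_anonymous_session("legacy.gsfc.nasa.gov",
#                              "jones@node.dept.inst.edu", quiet=True)
# print(ftp.pwd())
# ftp.quit()
```

Keeping the password rule in its own small function makes it easy to switch the quiet option on only when a session is seen to freeze.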
To access the legacy anonymous ftp account from your computer (which, of
course, has to have its own ftp client), you type:
> ftp legacy.gsfc.nasa.gov
You will get the ftp prompt. Follow the log-in instructions, giving
anonymous as the username, and your e-mail address as the password.
You will then be greeted by a `Welcome' screen which scrolls out automatically,
and presents some basic information on the account, and may have some specific
additional comments on changes and/or updates to its status, and/or the status
of some of the archival datasets. The user is now free to explore the contents
of the ftp account using unix-like commands such as
> cd sub_dir
to change directory to the sub-directory called sub_dir, or
> cd ..
to move back up one level in the directory structure, or
> pwd
to find out what the present (working) directory is. The entire suite of ftp
commands is listed using ?. To find out more about any individual
command, e.g., get, type help get. To find out the contents
of the present directory, type ls or dir. The latter command
is somewhat more informative than a straight ls command. The result of
typing dir in the top level directory of the legacy anonymous
ftp account is shown below.
legacy.GSFC.NASA.GOV> dir
<Opening ASCII mode data connection for /bin/ls.
total 23
drwxrwxr-x 6 415 340 512 Mar 15 16:38 .caldb
-rw-r--r-- 1 root 345 17 Jan 28 15:42 .login
-rw-r--r-- 1 3T 345 1324 Mar 11 09:25 .message
-rw-r--r-- 1 377 345 1906 Mar 11 09:28 README
drwxrwxr-x 2 root 345 512 Feb 9 11:41 ariel5
drwxrwxr-x 3 root 305 512 Mar 17 10:00 asca
drwxrwxr-x 12 227 335 512 Feb 25 15:40 bbxrt
drwxrwxr-x 2 ftp system 512 Mar 1 15:23 bin
drwxrwxr-x 2 root 345 512 Feb 9 09:51 compton
drwxrwxr-x 2 root 345 512 Feb 9 11:42 cosb
drwxrwxr-x 2 ftp system 512 Oct 7 11:31 dev
drwxrwxr-x 2 root 345 512 Feb 8 16:32 documents
drwxrwxr-x 5 root 310 512 Feb 26 09:58 einstein
drwxrwxr-x 2 ftp system 512 Oct 7 11:23 etc
drwxrwxr-x 2 root 345 512 Feb 9 11:42 exosat
drwxrwxr-x 3 root 360 512 Feb 13 08:54 ginga
drwxrwxr-x 2 root 345 512 Feb 8 16:32 retrieve
drwxrwxr-x 10 root rosat 512 Feb 16 16:50 rosat
drwxrwxr-x 2 ftp system 512 Oct 7 11:30 shlib
drwxrwxr-x 4 ftp system 512 Feb 13 08:10 software
drwxrwxr-x 2 root 345 512 Feb 9 11:42 vela5b
<Transfer complete.
The column on the far right gives the file or sub-directory name. If the first
character on a line is a "d", then the entry is a sub-directory. If not, it is
a file. Another useful datum for any entry is the size in bytes: this is given
in column 5 of the "dir" listing (just before the date and time that the entry
was last modified).
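The column rules just described (a leading "d" marks a sub-directory, the fifth column is the size in bytes, and the last column is the name) can be applied mechanically. The following Python sketch does so for lines copied from the listing above. The names Entry and parse_dir_line are hypothetical, and the sketch assumes the unix-style nine-column layout shown in this article; other ftp servers may format their listings differently.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    is_dir: bool
    size: int  # bytes, taken from column 5 of the listing


def parse_dir_line(line: str) -> Optional[Entry]:
    """Parse one line of a unix-style 'dir' listing.

    Returns None for lines that are not entries, such as 'total 23'
    or the server's '<Transfer complete.' messages.
    """
    fields = line.split()
    # Expected: permissions, links, owner, group, size, month, day, time, name
    if len(fields) < 9 or fields[0][0] not in "d-" or not fields[4].isdigit():
        return None
    return Entry(
        name=" ".join(fields[8:]),
        is_dir=fields[0].startswith("d"),
        size=int(fields[4]),
    )


listing = [
    "<Opening ASCII mode data connection for /bin/ls.",
    "total 23",
    "-rw-r--r-- 1 377 345 1906 Mar 11 09:28 README",
    "drwxrwxr-x 10 root rosat 512 Feb 16 16:50 rosat",
    "<Transfer complete.",
]
entries = [e for e in map(parse_dir_line, listing) if e is not None]
subdirs = [e.name for e in entries if e.is_dir]
```

In a scripted session the listing lines could be gathered with ftplib's retrlines("LIST", lines.append) instead of being typed in by hand.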
As can be seen from this particular example, most of the entries in the top
directory are actually sub-directories; two of the entries that are actual
files are .message (which contains the welcome message that appears after
log-in) and README (which is essentially an expanded description). If the user
is in some doubt as to what he or she wants, or what the contents of a given
directory or directory tree are, the user should type
> get README
which will copy the README file back to their own computer, where it can be
typed out and/or printed out. There should be a README and a .message file in
most of the top-level directories and subdirectories. The only exceptions are
directories used by ftp for its own purposes such as bin, dev, etc, and shlib.
For the bottom-level subdirectories like "rosat/pspc/images/fits" the README
and .message file in the parent directory "rosat/pspc/images" tell the user
what the contents and formats of the files in the "/fits" and "/ps"
subdirectories are.
The basic structure of the legacy anonymous ftp account is shown in
Figure 1. The name of each top-level directory describes its function.
Figure 1. legacy anonymous ftp basic structure.
                        top-level
   _________________________|_________________________
   |          |         |          |        |        |
software  documents  retrieve  `mission'  .caldb  `other'
`other': where `other' stands for one of several system sub-directories (bin,
dev, etc, shlib) that will normally be of no interest to the general user.
software: for general software packages such as XSPEC;
documents: for general documents like Users Guides, PROS cookbook, etc.;
retrieve: where people who have used the BROWSE facility on legacy will
go to find the data products that they have previously extracted, and then ftp
them back to their own machine;
`mission': where `mission' stands for one of the following missions for which
we will have data available: rosat, einstein, exosat, bbxrt, compton, asca,
ariel5, cosb, ginga, and vela5b;
.caldb: the calibration database. For practical reasons, we have set up a
distinct directory tree for calibration data. The easiest way for a user to
access calibration data is in the tree for that specific mission, but some
users are interested in obtaining specific calibration data and might prefer to
go straight to the .caldb directory. (Note: either way the user is accessing
the same physical calibration files).
Each `mission' directory has what is hopefully a well-defined, well-explained
(in the README files), logical structure. Since we are creating the
legacy ftp account both out of new databases and those presently
resident in other ftp accounts, and also since some projects have their own
individual quirks and peccadillos, it is difficult to enforce a completely
uniform `standard' format on all the missions. Figure 2 is an example of a
mission directory tree.
Figure 2. ROSAT example of mission directory tree
                                  rosat
    ________________________________|_________________________________
    |        |      |         |       |        |         |           |
calib_data  doc  problems  nra_info  data  timelines  software  publications
For each subdirectory for which it is relevant, the next level down is split
into instrument-specific subdirectories. This is only of relevance for
multiple-instrument missions like Einstein and ROSAT. Thus, for the data
subdirectory, the following (or logically equivalent) hierarchical structure
is normally followed: mission level, instrument level, data-type level, and
data-format level.
Figure 3. ROSAT example of subdirectory hierarchy
                              data
                      __________|__________
                      |                   |
instrument          pspc                 hri
                ______|______       ______|______
                |   |   |   |       |   |   |   |
data-type    images              images
             ___|___             ___|___
             |     |             |     |
data-format ps    fits          ps    fits
Again, we have attempted to use self-explanatory key-words for the various
data-types such as "images", "spectra", "rates", and "events", but there may be
some deviations from these standards.
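The mission, instrument, data-type, and data-format hierarchy described above maps directly onto a directory path. A minimal Python sketch of that mapping follows. The function name archive_path is hypothetical, and since deviations from the standard layout exist, a path built this way should be treated as a first guess to be checked against the README files rather than as a guarantee.

```python
def archive_path(mission, instrument, data_type, data_format=None):
    """Compose a data path following the hierarchy of Figure 3.

    data_format is optional because some areas quoted in this article
    (e.g., einstein/data/sss/spectra) stop at the data-type level.
    """
    parts = [mission, "data", instrument, data_type]
    if data_format is not None:
        parts.append(data_format)
    # Directory names in the account are lower case and the server is
    # case-sensitive, so normalise the case here.
    return "/".join(part.strip("/").lower() for part in parts)


pspc_fits = archive_path("rosat", "pspc", "images", "fits")
sss_spectra = archive_path("Einstein", "SSS", "spectra")
```

Both example paths are ones that appear elsewhere in this article; no other combination is implied to exist.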
Data formats
Much of the raw data and data products are made available in the form of FITS
(Flexible Image Transport System) files. William Pence has discussed our
implementation of this strategy in Legacy, 1, 14. As part of
this activity, the HEASARC has been developing the FTOOLS software package that
consists of a generic set of utilities with which FITS files can be
manipulated: this software package is available in the legacy anonymous
ftp account (check the sub-directory software/ftools). Note that when
ftp-ing a FITS file, the user should first type "binary" so that the utility is
configured to send a binary file.
Many data formats are available through this anonymous ftp account, as
follows.
(i) Many of the text files (such as the README files) are in plain ASCII. These
can be directly ftp-ed using the default ASCII mode.
(ii) Much of the data are available as PostScript (PS) plot files. The user
needs to have a printer that supports PostScript for these to be of any use.
These (plain ASCII) files can be copied over to the user's computer using the
ftp get command, and then a hard copy can be made following the usual
PostScript plot procedures. For example, in the area
rosat/data/pspc/images/ps, the user can find PostScript files to
create grey-scale images of ROSAT PSPC observations.
(iii) Much of the data are available as (unix-) COMPRESSED files. These are
indicated by the suffix .Z at the end of the file name. In general, a user
needs to have a computer with a unix operating system, in order to easily
UNCOMPRESS .Z files, although many non-unix machines now have special utility
programs that can translate .Z files. Whenever a COMPRESSed file is to be
copied, remember to set the transmission mode to binary, by typing
binary. Compressing a file helps to speed up the ftp-ing of large
amounts of data, which is why it is such a nice feature. The legacy ftp
software can also compress and decompress files "on the fly". For instance,
suppose there is a long ASCII file named "Bob" that you want to get,
then if you type:
> get Bob.Z
ftp will first COMPRESS Bob, and then send it to your computer. In the opposite
fashion, if you cannot UNCOMPRESS .Z files on your home computer, and you want
to copy a file in a legacy directory called Bill.Z, then simply type:
> get Bill
and ftp will first UNCOMPRESS Bill.Z, and then send you the resultant file.
(iv) Some of the data files are available as "tar" files. These are files
containing a group of separate sub-files that have been collected together
using the unix tar utility. Again, for these to be of use to a user, the user
has to have the facility to handle "tar" files on his or her home computer.
(v) Some of the files (generally documentation type) are in TeX or LaTeX
format. Since these are simply ASCII files with embedded TeX commands, these
can always be printed out (after being ftp-ed to the user's home computer) as
plain text files.
The above formats are not, of course, exclusive. For example, in the directory
"bbxrt/tar/by_observation", the user will find files such as
"bbxrt.n4151o.tar.Z" that are compressed (.Z) tar (.tar) files. Perusal of the
README file for bbxrt reveals that each such ".tar.Z" file contains a set of
individual files that are all themselves in FITS format.
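Two rules from the list above lend themselves to a short sketch: which transmission mode to set before a get, and which name to ask for when the server is to compress or uncompress a file on the fly. The Python below is illustrative only. The function names are hypothetical, the suffix table covers just the formats mentioned in this article, and treating tar files as binary is our assumption rather than something stated above.

```python
# Suffixes that call for "binary" mode. FITS and .Z files are named in
# the text; including .tar here is an assumption.
BINARY_SUFFIXES = (".fits", ".Z", ".tar")


def transfer_mode(filename):
    """Return 'binary' or 'ascii' for a file about to be fetched."""
    return "binary" if filename.endswith(BINARY_SUFFIXES) else "ascii"


def request_name(stored_name, want_compressed):
    """Name to 'get' so that the server converts the file on the fly.

    Asking for Bob.Z when only Bob exists makes the server compress it;
    asking for Bill when only Bill.Z exists makes it uncompress.
    """
    is_compressed = stored_name.endswith(".Z")
    if want_compressed and not is_compressed:
        return stored_name + ".Z"
    if not want_compressed and is_compressed:
        return stored_name[: -len(".Z")]
    return stored_name


wanted = request_name("Bob", want_compressed=True)  # 'Bob.Z'
mode = transfer_mode(wanted)                         # 'binary'
```

Note that the mode must follow the name actually requested: a plain ASCII file fetched as Bob.Z arrives compressed and therefore needs binary mode.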
What data do we presently have in the legacy anonymous ftp
account?
Since this is a sensitive function of time, no attempt will be made to give a
comprehensive list of the datasets in the anonymous ftp account. In a future
issue of Legacy, when the format becomes more stable, a more complete
description of the contents of the account will be given. The best way to find
out what is in the anonymous ftp account now is to ftp to it and check it out
for yourself. Some of the highlights are:
(i) The Broad-Band X-Ray Telescope (BBXRT) Archive: See the articles by A. P.
Smale in Legacy, 2, 17 and in this issue of Legacy for
further details.
(ii) The ROSAT Archive: See the article by M. F. Corcoran, M. Duesterhaus,
and K.L. Rhode in Legacy, 2, 9 for further details.
(iii) The Ariel-5 and Vela 5B All-Sky Monitor Databases: See the article by L.
Whitlock, J. Lochner, and K. L. Rhode in Legacy, 2, 25 for
further details.
(iv) The Ginga Large Area Counter (LAC) summary data files: See the article by
B. Perry in Legacy, 1, 30 for further details.
(v) The Compton Observatory Archive: See the article by T. McGlynn et al. in
Legacy, 2, 4 for further details.
(vi) Einstein data products from its various instruments (IPC, HRI, SSS, FPCS,
MPC, and IPC-Slew) have been created from all the SAO and HEASARC Einstein
CD-ROMs.
(vii) The COS-B Archive: This contains FITS files of the 65 pointed
observations and associated calibration data for the gamma-ray observatory
COS-B, and will be described in more detail in an article by P. Barrett in the
next issue of Legacy.
How to identify which data you really want
This is often the hardest part of using an anonymous ftp account, and
essentially involves knowing in which particular file or files the particular
observation in which the user is interested has been placed. The easiest way to
discuss this is by use of examples, but the reader should be aware that these
examples are mission-specific.
A later issue of Legacy will discuss how to use the
anonymous ftp account in conjunction with the BROWSE software that is scheduled
to be installed on the legacy machine in mid-1993, and to become
publicly available shortly thereafter. For the present article, two alternate
methods will be discussed:
(1) Using BROWSE to determine the existence of useful data, and then ftp-ing to
legacy;
(2) Doing it all via ftp.
Example 1: Getting SSS Files
Suppose I want to examine Einstein Solid State Spectrometer (SSS) spectra of 3C
120. Using method (1), I first access the XRAY account on NDADSA, and
BROWSE SSS. I do a search on the co-ordinates of this object (the
sc command) and am informed that the SSS made 6 observations of this
object. I examine the spectra in BROWSE using XSPEC, and decide that I want all
of the spectra on my own machine for more detailed analysis. I do a "dall" on
each spectrum, and find that they have (root)file names like "sc120a",
"sc120b", etc., at least on NDADSA. Notice that these filenames bear some
resemblance to the target in this example. This is not always the case in other
databases. Having found the filenames, I now ftp to the legacy
computer, log in as anonymous, and cd to the directory
einstein/data/sss/spectra, and am immediately discouraged because
ls sc120* finds no such files in this area. This is because there are
some file name inconsistencies between the databases on NDADSA and
legacy (which will be resolved when BROWSE moves to legacy). In
this case, it turns out that the initial "s" in the SSS file name has been
omitted in the version on legacy: thus, if I do ls c120* I will
at once find the files that I was looking for.
Using method (2) is somewhat less contorted since it only involves accessing
legacy, but it does involve me hunting around a bit more in the einstein
directory tree. Thus, I do the usual anonymous ftp into legacy, and
cd to einstein/doc/sss where I find lots of files with
useful-sounding names like sss.cat. I get 1 or 2 (or
mget all) of these files to my own computer. I peruse sss.cat
and find that it has a listing of all the sss observations as shown below:
              (yy.dd) (sec)  (rate)           (hh mm )   (o ' ")
Name             Time Expos   Count Ice start RA(1950) DEC(1950) File name
-------------+-------+-----+-------+---------+--------+---------+---------
1H 0240+621    79.032  7864    0.21      1.83 02 41 01  62 15 27 Q0241
1H 0334+098    79.233  2785    0.97      0.81 03 35 57  09 48 32 O335096A
1H 0334+098    79.234  8028    0.96      0.82 03 35 57  09 48 32 O335096B
2A 0430-615    79.207  7208    0.28      0.60 04 30 36 -61 32 60 O430615A
2A 0430-615    79.208  2621    0.29      0.98 04 30 36 -61 32 60 O430615B
2A 0430-615    79.208  4341    0.26      1.00 04 30 36 -61 32 60 O43061
2A 0922-317    79.135   409   -0.01      1.59 09 22 00 -31 42 00 X0922M
2A 1219+305    78.341  1884    0.28      3.06 12 18 52  30 27 14 H1219A
2A 1219+305    79.156  5406    0.39      1.21 12 18 52  30 27 14 H1219B
3C 120         79.049  7700    0.61      1.43 04 30 31  05 14 59 C120A
3C 120         79.070  7782    0.47      1.63 04 30 31  05 14 59 C120B
3C 120         79.232  8273    0.43      0.69 04 30 31  05 14 59 C120C
3C 120         79.233  5570    0.44      0.76 04 30 31  05 14 59 C120D
3C 120         79.247  1228    0.81      0.63 04 30 31  05 14 59 C120E
3C 120         79.248  4341    0.88      0.67 04 30 31  05 14 59 C120F
and thus I again determine that the SSS made 6 observations of 3C 120, and,
this time, I get the precise file names that correspond to them. I cd
to einstein/data/sss/spectra and type
> binary
> mget c120*
and the files will be sent to my own computer.
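Once sss.cat is on the user's own machine, the hunt for the file names belonging to one target need not be done by eye. The Python sketch below picks them out of rows like those shown above. The function name files_for_target is hypothetical, and the parsing assumes every data row ends with the same eleven whitespace-separated items (time, exposure, count rate, ice, three RA fields, three DEC fields, and the file name), which holds for the excerpt printed here but has not been checked against the full catalogue.

```python
def files_for_target(catalog_lines, target):
    """Return the catalogue file names listed for one target name."""
    wanted = "".join(target.split()).upper()
    found = []
    for line in catalog_lines:
        tokens = line.split()
        # A data row has at least one name token plus the 11 trailing
        # items described above; headers and rulers are shorter.
        if len(tokens) < 12:
            continue
        name = "".join(tokens[:-11]).upper()
        if name == wanted:
            found.append(tokens[-1])
    return found


SAMPLE = [
    "Name Time Expos Count Ice start RA(1950) DEC(1950) File name",
    "-------------+-------+-----+-------+---------+--------+---------+---------",
    "2A 0922-317 79.135 409 -0.01 1.59 09 22 00 -31 42 00 X0922M",
    "2A 1219+305 78.341 1884 0.28 3.06 12 18 52 30 27 14 H1219A",
    "3C 120 79.049 7700 0.61 1.43 04 30 31 05 14 59 C120A",
    "3C 120 79.070 7782 0.47 1.63 04 30 31 05 14 59 C120B",
]

# The catalogue prints names in upper case while the example above
# fetches them with "mget c120*", so lower-casing is assumed here.
remote_names = [name.lower() for name in files_for_target(SAMPLE, "3C 120")]
```

Matching on the name with the blanks removed means that "3C 120" and "3c120" find the same rows; as noted for BROWSE, a search by name remains riskier than one by coordinates.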
Example 2: Getting ROSAT Images
Now, suppose that I want to find out whether there is a ROSAT PSPC image of the
X-ray source 1H0551-819 available in the ROSAT Public Archive. Using method
(1), I BROWSE the relevant database (ROSUSPSPC; the equivalent one for the HRI
data is ROSUSHRI) inside the XRAY account on NDADSA. A search by name (always
risky) or a search at its coordinates (the safest technique) gives the
following result:
ROSUSPSP_PUBLIC_DEC > sc
R.A. (2000 d/f= 12 29 40.69 or 187.420): 6 12 44
Dec (2000 d/f= 24 31 14.16 or 24.521): -81 50 06
Radius arcmin (outer inner d/f= 60.00 0.00):
1
File Name Expos RA(2000) DEC(2000) Name Public date Public?
(sec) (hh mm s) (o ' ") (y.d) (YE/NO)
---------+-------+---------+----------+----------------+----------+------
1 RP300026 10123 06 12 44 -81 50 06 1H0551-819 93.016 YE
i.e., there is one observation of this object already in the public archive:
the relevant file name (root) is RP300026. I now do the anonymous ftp to
legacy, cd to rosat/data/pspc/images/fits, and type
>ls rp300026*
and I find that, sure enough, there are the 4 files that I want:
rp300026_im1.fits rp300026_im3.fits
rp300026_im2.fits rp300026_mex.fits
and I proceed to mget them safely home.
Using method (2), I directly connect via anonymous ftp to legacy. I
check my back issues of the HEASARC Journal, Legacy, and find the
article describing the ROSAT Public Data Archive by Corcoran et al. in
Number 2 on page 9. This fine article tells me where to find the lists of
public ROSAT Position Sensitive Proportional Counter (PSPC) and High Resolution
Imager (HRI) data. One caveat: because of our recent attempt to standardize the
individual databases, the paths given by Corcoran et al. are obsolete. Thus,
the lists of PSPC data (ppublic_data.pos, ppublic_data.seq,
and ppublic_data.date for the listings sorted by position, sequence
number, and public release date, respectively) can now be found in
rosat/data/pspc/doc, while the similar lists of HRI data
(hpublic_data.pos, hpublic_data.seq, and
hpublic_data.date) are now in rosat/data/hri/doc. Since, in
this example, I am interested in a PSPC observation, I cd to
rosat/data/pspc/doc, get the file ppublic_data.pos,
examine it (e.g., by searching on the coordinates or (more riskily) on the
name), and find:
300026|GSFC| |930116 | 6 12 44.000| -81 50 6.000|10123|1H0551-819|BUCKLEY
The first column of this table is the ROSAT ROR number: for an unfiltered,
US-processed PSPC observation, the filename (kernel) is the ROR number with the
initial prefix "rp". Thus the filename for the observation in which I am
interested is rp300026.... I now cd to
rosat/data/pspc/images/fits as before, and type (using an initial as
well as a final wild card "*" just to be on the safe side):
>ls *300026*
and I find the same 4 files that I found using method (1) which I can now
mget.
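The step from a line of ppublic_data.pos to a file name can be sketched in a few lines of Python. The sketch assumes, as in the entry quoted above, that the fields are separated by "|" and that the ROR number comes first. The function names are hypothetical, and the "rp" prefix applies only to the unfiltered, US-processed PSPC case described in the text.

```python
from fnmatch import fnmatchcase


def rosat_kernel(pos_line, prefix="rp"):
    """Build the file name kernel from one line of the position list."""
    ror = pos_line.split("|")[0].strip()
    if not ror.isdigit():
        raise ValueError(f"no ROR number at start of line: {pos_line!r}")
    return prefix + ror


def files_for_ror(names, ror):
    """Mimic 'ls *ROR*': wild cards on both sides, case-sensitive."""
    pattern = f"*{ror}*"
    return [name for name in names if fnmatchcase(name, pattern)]


line = "300026|GSFC| |930116 | 6 12 44.000| -81 50 6.000|10123|1H0551-819|BUCKLEY"
kernel = rosat_kernel(line)  # 'rp300026'

# File names as a directory listing might return them; the last entry
# is a made-up neighbour added to show that it is not selected.
listing = ["rp300026_im1.fits", "rp300026_mex.fits", "rp300027_im1.fits"]
selected = files_for_ror(listing, "300026")
```

The names in listing could equally come from ftplib's nlst() after a cwd to rosat/data/pspc/images/fits.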
If there are no files for that particular ROR presently available, check the
public release date for the observation (column 3 in the above entry from the
ppublic... file); if it is less than 2 to 4 weeks ago, the data have not yet
been transferred to legacy, and the user should wait a week or two
before re-checking the archive.
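The waiting rule above is simple date arithmetic. The Python sketch below assumes that the release date is written as yymmdd (930116 in the entry quoted earlier, matching the 93.016 shown by BROWSE) and takes the upper end of the 2 to 4 week range as its default. Both function names and the lag_weeks parameter are hypothetical.

```python
from datetime import date, datetime, timedelta


def release_date(yymmdd):
    """Convert a yymmdd string such as '930116' into a date.

    Python reads two-digit years 69 to 99 as 1969 to 1999, which suits
    the ROSAT era.
    """
    return datetime.strptime(yymmdd.strip(), "%y%m%d").date()


def may_not_be_transferred_yet(released, today, lag_weeks=4):
    """True if the data became public less than lag_weeks ago."""
    return today - released < timedelta(weeks=lag_weeks)


released = release_date("930116 ")
too_early = may_not_be_transferred_yet(released, date(1993, 2, 1))
late_enough = may_not_be_transferred_yet(released, date(1993, 3, 1))
```

A release date still in the future also gives True, which is the sensible answer: the data cannot be in the public area yet.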
Because of the sheer volume of the low-level ROSAT data products (of order 100
Gigabytes per year), the full ROSAT data archive presently resides on the NSSDC
computer NDADSA. If, after looking at the image of 1H0551-819, the user is
intrigued enough to want to do a full analysis of this observation, he or she
should refer to Section 2.2 of the Corcoran et al. article for a description of
how to request the complete dataset for a given observation from NSSDC.
Conclusions and caveats
This system is still being developed, so we will give periodic updates on the
status of the data archives and the software. For example, we hope shortly to
support the GOPHER facility, which (if your home computer also has this
utility) makes an anonymous ftp account much more user-friendly by allowing the
user to examine the contents of a file or files without having to transfer
them to his or her own computer. We will alert users in the Welcome Message to the
anonymous ftp account whenever we make major changes and/or enhancements to it
such as making GOPHER available. We welcome comments and suggestions to improve
this service: they should be sent to drake@lheavx.gsfc.nasa.gov (Internet) or
LHEAVX::DRAKE (DECNET).
Finally, a few words of caution: be careful of wild cards as some of the
directories may contain many, many files comprising hundreds of Megabytes of
data. It is not prudent to do "mget *" on directories. If you need entire data
archives, contact the HEASARC and we will discuss the most efficient way of
getting such large data volumes to your home site.
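One way to heed this caution is to add up the sizes of the files that a wild card would match before issuing the mget. The Python sketch below does this for (name, size) pairs such as could be read off a dir listing. The function name matching_volume is hypothetical and the sizes in the example are invented purely for illustration.

```python
from fnmatch import fnmatchcase


def matching_volume(entries, pattern):
    """Count the files matching pattern and total their sizes in bytes.

    entries is an iterable of (name, size_in_bytes) pairs.
    """
    sizes = [size for name, size in entries if fnmatchcase(name, pattern)]
    return len(sizes), sum(sizes)


# Invented sizes, for illustration only.
entries = [
    ("rp300026_im1.fits", 530_000),
    ("rp300026_im2.fits", 530_000),
    ("rp300026_mex.fits", 1_050_000),
    ("rp300027_im1.fits", 530_000),
]

count_one, bytes_one = matching_volume(entries, "rp300026*")
count_all, bytes_all = matching_volume(entries, "*")
```

Comparing the two totals before transferring shows at a glance how much more a bare "*" would bring home than the narrower pattern.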
Last modified: Monday, 19-Jun-2006 11:40:52 EDT