Overview of ARM Archive User Interface
(2/5/98; revised 09/04/2009 Stefanie Hall)
The ARM Archive contains more than 2,200,000 user-accessible data files formatted in more than 2000 types of data streams. (The total data volume of the Archive is more than 6,500,000 files and 38 terabytes.) The user interface for the Archive is designed to help identify the specific ARM data files that should be retrieved for a data user's request, without requiring the user to go through numerous, very long lists of obscure filenames. The magnitude of the ARM data collection requires that data be stored in a Mass Storage System (MSS: a collection of computers and automated tape libraries containing thousands of tape cartridges). Because the data files are not 'on-line', the user interface processes 'directory' information from an on-line database to identify the availability of data files. A schematic of the Archive can be seen here. Secondary processing by the Archive computers copies requested data files from the MSS to an accessible FTP site. Users are notified by e-mail when all requested files are available at the FTP site. Access to the data files is complete when the user has copied the files via FTP to their own system. Processing of requests greater than 25 GB (~10,000 files) is suspended until the Archive staff confirm the availability of online storage.
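The size rule in the last sentence above can be pictured as a simple check. The following is only an illustrative sketch and not the Archive's own software: the function name is hypothetical, "25 GB" is assumed to mean 25 x 10^9 bytes, and the file count is treated as a separate limit, following the "Logical Flow" section further down.

```python
# Illustrative sketch only; not the Archive's actual code.
# Assumption: "25 GB" means 25 * 10**9 bytes (decimal gigabytes).
# Assumption: the 10,000-file figure is checked as a separate limit.
SIZE_LIMIT_BYTES = 25 * 10**9
FILE_COUNT_LIMIT = 10_000


def needs_staff_confirmation(file_sizes):
    """Return True if a request would be held until staff confirm storage.

    file_sizes: iterable of file sizes in bytes (hypothetical input).
    """
    sizes = list(file_sizes)
    return sum(sizes) > SIZE_LIMIT_BYTES or len(sizes) > FILE_COUNT_LIMIT


if __name__ == "__main__":
    small_request = [2_500_000] * 100       # 100 files, 250 MB in total
    large_request = [2_500_000] * 12_000    # 12,000 files, 30 GB in total
    print(needs_staff_confirmation(small_request))
    print(needs_staff_confirmation(large_request))
```

A check of this kind would run when the request is submitted, before any files are copied from the MSS.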
The following sections provide additional information on:
- The computing capabilities needed to access the Archive and use ARM data
- A logical overview of the Archive User Interface
- User Interface choices
- Data Browser Interface
- Catalog Interface
- Thumbnail Browser
- Statistical Browser
- IOP Data Browser
Presumed computing capabilities by Archive interface users
Users of the ARM Archive interface and retrieved data files are presumed to have the following computing capabilities:
- A WWW browser (the interface is designed and tested for Netscape 4.0 or higher; other web browsers appear to be okay as well)
- required to view the user interface
- helpful for accessing ARM documentation (http://www.arm.gov/)
- An e-mail address (required for retrieval notifications)
- tools for Internet file transfer by way of FTP
- very large requests can be transmitted by tape (contact armarchive@ornl.gov or call 1-888-ARM-DATA for assistance)
- system acceptance of long filenames (ARM filenames range from ~20-64 characters)
- netCDF or HDF tools
- compilers [C or Fortran] for incorporating public domain subroutines into user written software or
- commercial applications for analyzing netCDF (e.g., IDL, MATLAB)
Logical Flow of the User Interface
The logic of the user interface includes the following steps:
- Login to interface
- This step enables the interface to track your request specifications and notify you when your files are retrieved.
- specify your username or email address, if you have previously registered
- register a username, if you are a new user
- we need to know an e-mail address for notification of successful file retrievals.
- name, address, and phone number also provide important information for contacting you and characterizing the ARM data user community.
- Review request status or specify new request
- Select Interface type
- Data Browser Interface
- Specify files to be requested with exact specifications for site, date range, instrument or measurement type, and facility.
- Catalog Interface
- Browse tables of data availability summarized by location, year, instrument type, etc. and select data in monthly increments
- Thumbnail Browser
- Browse daily thumbnails and quicklooks of files with specifications for site, date range, instrument, and measurement.
- Statistical Browser
- Browse a series of drill-down statistical graphs for showcase datasets with the option to extract more statistical information or order ARM data files.
- IOP Data Browser
- Review Intensive Operational Period (IOP) data stored in an online, documented directory tree and download files individually or build collections of files as a TAR file.
- Select ARM data
- enter query specifications in data browser interface
- select entries from the catalog interface
- download or "check" items in IOP data browser
- Review data selection results and submit retrieval request
- Each interface displays an estimate of the number of files and bytes contained in the request
- Review
- Specify additional requests or logoff the interface
- This is the end of an interactive session with the user interface
- Users are notified by e-mail when the requested files are accessible from online storage.
- A secondary computer program supervises the copying of the requested files from the Mass Storage System to the user accessible FTP storage.
- Requests greater than 25 GB (sum of file sizes) or 10,000 files are suspended until the availability of FTP storage is confirmed by Archive staff.
- The time required to complete the retrieval of files from the MSS depends on:
- The number of files requested (e.g., >5000 files may require a few hours to complete)
- The number of other requests pending in the retrieval 'queue'.
- Review data notifications
- description of data quality report system
- request for credit and publications
- Use FTP to download data files (follow link in notification message)
- connect to ftp.archive.arm.gov
- enter username: armguest
- enter email address as FTP password
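The final FTP step above could be scripted roughly as follows. This is a hedged sketch using Python's standard `ftplib`; the host and guest username come from the steps above, while the function names, the `remote_dir` argument (the directory given in the notification message), and the assumption that this directory contains only plain files are illustrative.

```python
import os
from ftplib import FTP

ARCHIVE_HOST = "ftp.archive.arm.gov"   # host named in the steps above
ARCHIVE_USER = "armguest"              # guest username named in the steps above


def fetch_request(ftp, remote_dir, local_dir):
    """Copy every file listed in remote_dir to local_dir.

    ftp: an already logged-in ftplib.FTP session (or a compatible object).
    remote_dir: hypothetical directory taken from the notification message.
    Assumes the directory holds only plain files, no subdirectories.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():
        local_path = os.path.join(local_dir, name)
        with open(local_path, "wb") as handle:
            # Binary mode matters for netCDF and HDF files.
            ftp.retrbinary("RETR " + name, handle.write)
        saved.append(local_path)
    return saved


def download(email, remote_dir, local_dir):
    """Log in as the guest user, giving the e-mail address as the password."""
    with FTP(ARCHIVE_HOST) as ftp:
        ftp.login(user=ARCHIVE_USER, passwd=email)
        return fetch_request(ftp, remote_dir, local_dir)
```

Keeping the transfer loop separate from the login makes it possible to try the loop against a stand-in session object before connecting to the real server.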
User Interface choices
The Archive provides five online user interfaces for specifying the files that a data user needs to access. The user interfaces all accomplish the same function (facilitating user access to the data files), but they support complementary approaches to finding the files that you want among the 5,000,000+ files stored in the Archive. Summary descriptions of the user interfaces are:
- Data Browser Interface
- Identifies available data files from exact specifications of site, date range, instrument or measurement type, facility, etc.
- The Data Browser Interface provides an overview of ARM data quality. It displays daily quality color (green, yellow, red) for user specified subsets of sites, facilities, measurements and date ranges.
- The Data Browser can also provide detailed information about Data Quality Reports and quick looks for user specified search criteria
- Catalog Interface
- Supports browsing of summary tables (by combinations of year, site, data source, etc.) about file availability and the specification of data requests in one month increments
- Thumbnail Browser
- Displays specified files in a thumbnail format for browsing and viewing quicklooks in greater detail.
- Statistical Browser
- Allows users to view statistical plots of showcase datasets, and then drill down through time scales ranging from the full period of record to individual months.
- Provides the option to extract the data behind the statistical graphs, obtain the measurements used to calculate the statistics, and order the ARM data files from which the measurements were obtained.
- IOP Data Browser
- Provides access to IOP data stored in an online, documented directory tree.
More information about these interfaces is provided in the sections below. Requests for assistance with data can also be submitted to the Archive User Services (email: armarchive@ornl.gov or phone 1-888-ARM-DATA or 1-865-241-4851).
Data Browser Interface
The requested data files are identified by a query to an online database representing the 'directory' of available files. Requested files are typically identified from queries related to site, time, instrument or measurement or data stream, and facility. Besides ordering files, users can view data quality information (such as Data Quality Reports, the Data Quality Color Calendar, and Quick Looks) for the selected data streams and date ranges. The queries for user-defined selections of files are based on the following three logical pathways:
1) Novice Interface (Show Figure):
- Site:
- data must be selected from one geographic site per request
- Date range:
- starting and ending dates for the query must be specified
- This is explicitly specified by the user by way of month, day, and year selections for each date criterion.
- Search Path:
- Instruments or Measurements
- Instruments or Measurements Category:
- One or more categories can be selected
- Instruments or Measurements:
- one or more Instruments or Measurements can be selected within the selected category
- Facilities:
- A list of facilities is displayed based on the previous selection criteria (specific to site, date range, category, and instruments or measurements)
- One or more facilities can be selected from all the available facilities
- Files to order:
- A list of files is displayed based on the selected search criteria
2) Datastream Interface (Show Figure):
[Datastream Interface is equivalent to the data streams options found in the previous Power User application].
- Site:
- data must be selected from one geographic site per request
- Date range:
- starting and ending dates for the query must be specified
- This is explicitly specified by the user by way of month, day, and year selections for each date criterion.
- Data Streams:
- A list of available data streams is displayed based on the selected site and date range
- Files to order:
- A list of files is displayed based on the selected search criteria
Additional information about these query options is provided in the table below.
For a step-by-step tutorial on using the Data Browser, click here.
Query option | Type of logic | User efficiency | User actions | Limitations |
Novice Interface: | ||||
Instrument | indirect: background filtering of the potential data stream list from user selected criteria for site, date range, instrument categories, instruments, facilities and highest data level | high: when searching for data from specific instruments; low: when selecting data for a diversity of instruments | selects site, date range, instrument categories, instruments, and facilities | lengthy list of instrument names; presumes knowledge of the instrument's measurement capabilities |
Measurement | secondary, indirect: background filtering of the potential data stream list from user selected criteria for site, date range, measurement categories, measurements, facilities and highest data level | high: when searching for many possible variations of a measurement type; low: when searching for a diverse set of unrelated measurements | selects site, date range, measurement categories, measurements, and facilities | lengthy list of measurement names; availability of measurements is confounded by site, date, facility, and data level criteria |
Datastream Interface: | ||||
Data streams | indirect: background filtering of the potential data stream list from user selected criteria for site, date range | high: when searching for a few specific data stream types; low: when selecting a diversity of data stream types | selects site, date range, data stream names | presumes a working knowledge of ARM data stream name codes; requires scrolling a VERY long list of data stream names |
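The Datastream Interface pathway (site, then date range, then data streams, then files) amounts to successive filtering of 'directory' records. The sketch below illustrates that logic only; the record fields, data stream names, and file names are hypothetical, since the actual database schema is not described in this document.

```python
from datetime import date

# Hypothetical 'directory' records; field names and values are illustrative.
DIRECTORY = [
    {"site": "SGP", "datastream": "stream_a", "date": date(2009, 1, 1), "file": "file_001"},
    {"site": "SGP", "datastream": "stream_a", "date": date(2009, 1, 2), "file": "file_002"},
    {"site": "SGP", "datastream": "stream_b", "date": date(2009, 1, 2), "file": "file_003"},
    {"site": "NSA", "datastream": "stream_a", "date": date(2009, 1, 2), "file": "file_004"},
]


def in_scope(record, site, start, end):
    """One site per request, with inclusive starting and ending dates."""
    return record["site"] == site and start <= record["date"] <= end


def available_datastreams(directory, site, start, end):
    """Data streams that have at least one file for the site and date range."""
    return sorted({r["datastream"] for r in directory if in_scope(r, site, start, end)})


def files_to_order(directory, site, start, end, datastreams):
    """Files matching the site, date range, and the chosen data streams."""
    wanted = set(datastreams)
    return [
        r["file"]
        for r in directory
        if in_scope(r, site, start, end) and r["datastream"] in wanted
    ]


if __name__ == "__main__":
    start, end = date(2009, 1, 1), date(2009, 1, 31)
    print(available_datastreams(DIRECTORY, "SGP", start, end))
    print(files_to_order(DIRECTORY, "SGP", start, end, ["stream_a"]))
```

The Novice Interface pathway would add further filters (category, instrument or measurement, facility) between the date range and the file list, in the same style.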
Catalog Interface
The catalog-based user interface presents, in an interactive sequence of tables, a hierarchical summary of available data files (see Figure 1) organized in a way that will be useful to the inexperienced, as well as the expert, Archive user. In addition to leading the user to specify a subset of data, the catalog is intended to display the availability of the data. The availability of data is irregular in time and space because of incremental changes in the installation and operation of the field sites (points of data generation). The content of the table's cell values indicates the quantity of available data (number of files) within the criteria represented by each cell. Criteria combinations for which data are available contain cell values greater than 0 and are linked to the next subset levels. Combinations containing no data display '0' and are not linked.
The catalog navigation metadata is combined with a "Data Cart" concept for collecting file sets of particular interest. At any level, the user may view the contents of the data cart, remove file sets from the data cart, or submit the list for retrieval from the Archive.
Description of the Interface
The ARM Archive catalog interface consists of two major components: 1) a catalog of available data files organized in a four-level hierarchy, and 2) a data cart collection scheme that allows the user to store, edit and display a list of selected file sets. The interface programs display a sequence of linked HTML tables that allow the user to move through the various catalog levels, converging to desired sets of files. The hierarchy includes links to tables for increasingly narrow subsets of the data collection (see Figure 1). Selecting a value in each table leads to a table showing more detail in the next step. After the fourth step, data may be selected for addition to the Data Cart. The section below describes the user interface at each of these levels.
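Each catalog level is, in effect, a table of file counts over two attributes, restricted by the choices made at earlier levels, with only nonzero cells linked onward. The sketch below shows that idea under stated assumptions: the record fields, the sample values, and the use of square brackets to stand in for a hyperlink are all hypothetical and are not taken from the Archive's implementation.

```python
from collections import Counter


def count_table(records, row_key, col_key, **selected):
    """Count files per (row, column) cell among records matching `selected`.

    records: iterable of dicts describing files (hypothetical field names).
    selected: attribute values fixed at earlier catalog levels.
    """
    counts = Counter()
    for rec in records:
        if all(rec[key] == value for key, value in selected.items()):
            counts[(rec[row_key], rec[col_key])] += 1
    return counts


def render_cell(counts, row, col):
    """Nonzero cells are 'linked' (shown in brackets); empty cells show 0."""
    n = counts.get((row, col), 0)
    return f"[{n}]" if n > 0 else "0"


if __name__ == "__main__":
    records = [
        {"site": "SGP", "year": 2008, "category": "radiometric", "facility_type": "central"},
        {"site": "SGP", "year": 2008, "category": "radiometric", "facility_type": "extended"},
        {"site": "SGP", "year": 2009, "category": "cloud", "facility_type": "central"},
        {"site": "NSA", "year": 2009, "category": "cloud", "facility_type": "central"},
    ]
    level1 = count_table(records, "site", "year")
    print(render_cell(level1, "SGP", 2008), render_cell(level1, "NSA", 2008))
    level2 = count_table(records, "category", "facility_type", site="SGP", year=2008)
    print(render_cell(level2, "radiometric", "central"))
```

Levels three and four would reuse the same function with instrument and data level, then facility and month, as the row and column attributes.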
Instructions
For a step-by-step tutorial on using the Catalog Browser Interface, click here.
Step 1: Selecting the Site and Year
Following a login screen, the top level of the interface presents the number of files available in the Archive grouped by site and year (Figure 2). The user selects a site and year by clicking on the corresponding number of files in the table, assuming the number is nonzero.
Step 2: Selecting the Instrument Category and Facility Type
This selection takes the user to the second level (Figure 3), which displays all instrument categories and types of facilities from which ARM data were collected for the site and year chosen on the previous page. From this level an instrument category and facility type are chosen by clicking on the number of files in the appropriate cell of the table. Alternatively, the user may return to level 1 (to change the previous selection) by clicking on "Year" or "Site" at the top of the page.
Step 3: Selecting the Instrument and Data Level
The third level (Figure 4) lists the number of available files by instrument code and data level, for the previously selected combination of site, year, instrument category and facility type. The data level reflects the amount of processing done on raw data. Instrument and data level codes are briefly described below the table. Again, options are available to return to levels 1 or 2 via links at the top of the page.
Step 4: Selecting the Facility and Month
The final level (Figure 5) in the hierarchy of metadata attributes allows the user to select file sets by facility and month, or return to one of the previous three levels.
Step 5: Adding Files to the Data Cart
After the selection of facility and month (by clicking on a nonzero number of files in the table), the user is presented with a summary of the selections (Figure 6) together with the number and total size of the data files. At this point the user may elect to add these files to the data cart, return to any of the previous interface levels to edit selections, or continue browsing. Adding the set of files to the data cart returns the user to the original catalog interface, with the selected data added to "Current Selections" (Figure 7). The user may then continue browsing and adding data selections to the Data Cart. Each time, the chosen datastream will be added under "Current Selections." To remove a data selection, highlight the selection and click "Remove Selected Streams."
Step 6: Ordering Selected Data Files
When the user is satisfied with a collection of file sets, clicking "Proceed to Order" will bring up the selected data. The user may then elect to "Select All" files, choose only certain files for ordering, or extract measurements (Figure 8). Clicking "Order Files" will submit the user's request for the selected files. An "Order Confirmation" will then be displayed (Figure 9).
Summary and Discussion
The catalog interface enables the ARM researcher to efficiently identify files of interest, determine the existence of data, and collect sets of data prior to submitting a retrieval request. Important aspects of the system described here include the assignment of descriptive instrument categories and the dynamic explanation of instrument codes. Collection of data sets is currently done at the facility/month level. The collection (data cart) may be listed and edited from any level.
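The data cart described in this section behaves like a small keyed collection of file sets that can be added to, listed, edited, and totalled before ordering. The class below is a hypothetical illustration of that behaviour, not the Archive's code; the class name, method names, and the tuple used as a selection key are all assumptions.

```python
class DataCart:
    """Hypothetical stand-in for the "Current Selections" list."""

    def __init__(self):
        # selection key -> (number of files, total size in bytes)
        self._file_sets = {}

    def add(self, selection, n_files, n_bytes):
        """Add one file set, collected at the facility/month level."""
        self._file_sets[selection] = (n_files, n_bytes)

    def remove(self, selection):
        """Remove a file set; removing an absent set is a no-op."""
        self._file_sets.pop(selection, None)

    def contents(self):
        """List the selections currently held, in sorted order."""
        return sorted(self._file_sets)

    def totals(self):
        """Return (number of files, total bytes) for the whole cart."""
        n_files = sum(n for n, _ in self._file_sets.values())
        n_bytes = sum(b for _, b in self._file_sets.values())
        return n_files, n_bytes


if __name__ == "__main__":
    cart = DataCart()
    # Key layout (site, year, instrument, data level, facility, month) is illustrative.
    cart.add(("SGP", 2008, "instrument_x", "level_1", "facility_1", 3), 31, 77_500_000)
    cart.add(("SGP", 2008, "instrument_x", "level_1", "facility_1", 4), 30, 75_000_000)
    print(cart.totals())
    cart.remove(("SGP", 2008, "instrument_x", "level_1", "facility_1", 3))
    print(cart.contents())
```

Keeping the file count and byte total with each file set lets the cart report the size of a request without another database query.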
Thumbnail Browser
Background
In November 2004, the Thumbnail Browser Interface joined the existing suite of user tools for finding and selecting data on the ARM Data Archive website. This relatively new interface provides users with a graphical view of the data files before they decide to request or download them for additional use. The Thumbnail Browser also offers the following advantages:
Description
The first segment of the user interface enables the user to specify general criteria for "data of interest". The initial pages of the Thumbnail Browser are very similar to those of the Data Browser (select sites, facilities, date range, etc.). The user is provided three pathways to make these initial selections: Novice Interface, Power Interface, and Catalog Interface. These pathways are logically similar to the pathways in the Data Browser. Further steps in the Thumbnail Browser allow the user to view thumbnails of plots containing many of the primary measurements from ARM data streams. From the thumbnail views, users can directly access larger-scale data plots called "Quick Looks", review Data Quality Reports, or select data files to be requested for retrieval and download. Data files are selected with graphical check boxes that allow selection by day or by multiple days for a single data stream, or selection of all files for all data streams. (Click here to see an example of the Thumbnail Browser results page)
The Thumbnail Browser provides the user with many options for customizing the thumbnail display within two thematic views:
Both of these views allow the user to easily view consecutive time ranges or days in a sequence. These thumbnail views may be saved for future reference or emailed as a link to other users to show them the same "views" and make additional specifications for their data requests.
Instructions
For a step-by-step tutorial on using the Thumbnail Browser Interface, click here.
Statistical Browser
Background and Description
The Statistical Browser (also referred to as "statistical views") is the newest interface to be developed, and currently consists of pre-computed products for nested time ranges (whole period of record, annual, seasonal, and monthly, as appropriate). For each time range and measurement, a variety of simple statistics are computed. Graphs of the statistical distribution of measurements (e.g., histograms) are also linked to the actual statistics displayed in the graphs. The graphs are available through a web-based interface. Users select a location and measurement and then drill down through time scales ranging from the full period of record to individual months. In addition to viewing graphs displayed by the user interface, users are able to extract the data behind the statistical graphs, obtain the measurements that were used in calculating the statistics, and order the ARM data files from which the measurements were obtained. This interface currently contains statistical views for showcase datasets.
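The idea of pre-computing simple statistics for nested time ranges can be sketched as follows. This is an assumption-laden illustration, not the Archive's processing code: the particular statistics (count, minimum, maximum, mean, median), the function names, and the `(year, month, value)` input layout are chosen for the example, and the seasonal range is omitted for brevity.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(values):
    """A few simple statistics for one time range (choice is illustrative)."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def nested_statistics(samples):
    """Group samples into nested time ranges and summarize each group.

    samples: iterable of (year, month, value) tuples (hypothetical layout).
    Returns a dict keyed by ("record",), ("annual", year), or
    ("monthly", year, month).
    """
    groups = defaultdict(list)
    for year, month, value in samples:
        groups[("record",)].append(value)            # whole period of record
        groups[("annual", year)].append(value)       # one calendar year
        groups[("monthly", year, month)].append(value)
    return {key: summarize(values) for key, values in groups.items()}


if __name__ == "__main__":
    samples = [(2008, 1, 1.0), (2008, 2, 3.0), (2009, 1, 5.0)]
    stats = nested_statistics(samples)
    print(stats[("record",)]["mean"])
    print(stats[("annual", 2008)]["mean"])
    print(stats[("monthly", 2009, 1)]["count"])
```

Drilling down in the interface then corresponds to looking up progressively more specific keys in a table of this kind.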
Instructions
Step 1: Begin by selecting an ARM site for which to view available statistical plots and summaries. Current ARM sites available in the Statistical Browser are: SGP, NSA, TWP, and HFE.
Step 2: Next, select a dataset. Current showcase datasets available are:
- ARM Surface Radiation Data (qcrad1long)
- Climate Modeling Best Estimate Data (CMBE)
- Long-Term Continuous Forcing Data from Variational Analysis (CONSTRVARANA)
Step 3: Select a facility from the list of those available.
Step 4: Select a measurement from the list of those available. This will display the available plot types for the selected measurement.
Step 5: Select a plot type from the list of those available. Plot types will vary based on dataset and measurement, ranging from daily to monthly to seasonal plots. Users can mouse-over the image in parentheses to see a description of each plot type. Clicking on the image will bring up a sample of that particular plot type.
Step 6: Select a date range by entering start month/year and end month/year, and click on "Get Plots" to view the plots.
Step 7: The plots will be displayed below the interface in thumbnail form. Users may click on any thumbnail image to view the detailed data plot and utilize additional features for accessing the data. These features are:
- Get Statistics: Available in Text, Excel, and XML formats. After choosing a format, the statistics will be displayed.
- Get Data for the Selected Range: Available in Text, Compressed Text (gz), Excel, and NetCDF formats. Once the format is chosen, a "Download Data" screen will appear while the request is being processed. This may take a few seconds. When processing is complete, follow the URL given to download the extracted measurement data.
- Get ARM Data Files: Select "Add to Cart" or "List Files" to order the ARM Data files chosen by the user.
IOP Data Browser
Background
Intensive Operational Periods (IOPs) generate data that are "non-routine" because they originate from extra or guest data sources. The data may also be "non-routine" because the instruments are operated with temporary, experimental (non-production) protocols. All of these exceptions from normal operations cause significant "clutter" in the metadata and logic used in the query and catalog interfaces. Constraining the structure of the IOP data to follow the simple logic required to successfully manage the 5,000,000+ ARM data files challenged the creativity of the ARM data managers and frustrated the IOP data generators (who are often guest collaborators with ARM and are not, or should not be, fully indoctrinated in ARM-specific data management practices). The IOP Data Browser is also used for storage and access of reference data sets (e.g., geographic overlays of states, rivers, etc. for satellite images) and special data (e.g., preliminary versions of VAP output).
The IOP Data Browser was designed to provide the following features:
- It presents enough structure so that potential data users can follow an understandable path to identify and access the IOP data sets.
- It allows for considerable flexibility in how the data are structured within an IOP.
- Minimal rules about names for sub-directories and files within each IOP
- Every subdirectory has a "readme" explaining its contents. The specifications for the readme are minimal, but links to more extensive web-based documentation are allowed.
- Minimal expectations for IOP data to follow similar naming or documentation
- It enables users to select and download a few individual files or a few individual sub-directories
- It enables the Archive to track "who accessed which data when" for reporting and update notification purposes.
Description
The IOP Data Browser contains a documented, online directory tree of IOP data. The IOP data are organized in a hierarchy of year / site / IOP / instrument - PI subdirectories. Additional subdirectories may be used within an IOP. Each subdirectory has a "readme" file to guide the user through that level's information. Data from IOPs may be downloaded as individual files by clicking on each file link. If the user needs to download large portions of IOP data (multiple files or subdirectories), a "check box system" (described in the outline below) can be used to select files and directories to be built into a single TAR file for download. The TAR file is created after the end of an IOP browsing session, and the user is notified by email when the TAR file is ready to download.
The IOP Data Browser presents a 3 section display:
- The top section displays the contents of the readme for the current subdirectory.
- This readme may link to additional information at other web sites generated or referenced by the IOP participants.
- The primary ARM documentation about IOPs is contained in a series of web pages located at: http://www.arm.gov/campaigns
- The ARM documentation has a directory structure that is similar to the one used for the IOP data
- Other web sites may be visited without losing your place in the IOP data structure.
- The middle section shows a traditional browser-based directory and file list that can be used to navigate the data collection.
- The top of this section shows the current directory path for "where am I".
- The main portion of this section lists directories and files within the current directory.
- Users may click on directory links to navigate to lower levels.
- Users may click on a file link to open or download individual files.
- For some formats (e.g., netCDF), other information about the data files may be displayed.
- Very large data files (e.g., cloud radar, WSI, etc.) may be stored in the Mass Storage System of the Archive.
- The readme information for these files will include information on how to find these IOP data in the Archive.
- Each directory or file link displayed has a "check box" on the left side to select data to be added to a TAR file.
- Clicking the check box for a file will add the file to a TAR file.
- Clicking the check box for a directory will add the entire contents of the directory (including the contents of lower subdirectories and files) to the TAR file.
- After sub-trees of the directory have been "checked", lower level files and sub-directories may be unchecked as needed to specify the exact collection of IOP data to be included in the TAR file.
- The bottom section shows information and options about the TAR file being specified for downloading multiple files and directories
- Lists of included directories are displayed (and can be removed as needed)
- Lists of excluded directories are displayed (and can be removed as needed)
- Option for "zipping" the TAR file can be selected
- Control buttons for submitting the request for TAR construction are located in this section.
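The check box logic above (check a directory to include its whole sub-tree, then uncheck lower levels to exclude them) can be sketched as an include list minus an exclude list, followed by TAR construction. The sketch below uses Python's standard `tarfile` module and is illustrative only: the function names and path layout are hypothetical, and the "zipping" option is assumed here to mean gzip compression.

```python
import os
import tarfile


def is_under(path, entry):
    """True if path is the checked entry itself or lies below it."""
    return path == entry or path.startswith(entry.rstrip("/") + "/")


def resolve_selection(all_files, included, excluded):
    """Files under any included entry, minus files under any excluded entry.

    Paths are relative to the IOP root and use "/" separators.
    """
    return sorted(
        f
        for f in all_files
        if any(is_under(f, inc) for inc in included)
        and not any(is_under(f, exc) for exc in excluded)
    )


def build_tar(root, selected, tar_path, compress=False):
    """Write the selected files into one TAR file, optionally gzip-compressed."""
    mode = "w:gz" if compress else "w"
    with tarfile.open(tar_path, mode) as tar:
        for rel in selected:
            # arcname keeps the year/site/IOP/... layout inside the archive.
            tar.add(os.path.join(root, rel), arcname=rel)
    return tar_path


if __name__ == "__main__":
    files = [
        "2009/sgp/iop1/readme.txt",
        "2009/sgp/iop1/pi_a/d1.dat",
        "2009/sgp/iop1/pi_b/d2.dat",
    ]
    print(resolve_selection(files, ["2009/sgp/iop1"], ["2009/sgp/iop1/pi_b"]))
```

Resolving the selection to an explicit file list before building the archive also gives a natural place to record which files each user requested.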
Access and login to the IOP Data Browser
The IOP Data Browser can be accessed after a login to the Archive User Interface, or it can be accessed directly at http://iop.archive.arm.gov/arm-iop/. (The IOP Data Browser can also be accessed from links located throughout ARM IOP documentation; see the web pages located under http://www.arm.gov/campaigns). All attempts to access the IOP Data Browser will prompt a web login requiring the entry of a username and password. The user should enter their Archive account name for BOTH the username and password. Although this login appears to be redundant, it enables the Archive to record the user access of each file. The records of access are important for distributing notifications about future updates to IOP data and reporting statistics on the usage of IOP data.