NERSL NPRA Legacy Data Archive Home Page
NPRA Legacy Data Archive
Introduction
The
National Petroleum Reserve, Alaska, (NPRA) Legacy Data Archive represents one
of the largest geological and geophysical data sets held by the U.S. Geological
Survey (USGS). From 1944 to 1953 the
U.S. Navy operated a large-scale exploration of the then Naval Petroleum
Reserve No. 4, drilling 36 test wells and 45 core tests. A second, more extensive exploration program
was operated between 1974 and 1982. Run
first by the U.S. Navy and later the USGS, this exploration program collected
over 12,000 line miles of seismic data and drilled 28 exploratory wells. Both these exploration programs generated a
vast amount of data (digital and analog), analyses, and documents that are
being captured, inventoried, cataloged, and archived to Compact Disc-Recordable
media and made available over the Internet.
Initially distributed to the public by the
National Oceanic and Atmospheric Administration's (NOAA) National
Geophysical Data Center, the entire data set was returned to the USGS in 1993. The size and content of this data set
presented the USGS, the agency responsible for maintaining and distributing
this information to the public, with some fundamental data storage and
distribution problems.
A Brief History of the NPRA
·
23 million acre area, approximately the size of the
state of Indiana.
·
Largely unexplored
until the early 1900's.
·
1923 President Harding
formed Naval Petroleum Reserve No. 4.
·
1923-1926 Initial surveys by the USGS for the Dept.
of the Navy.
·
1926-1943 Little
exploration done.
·
1943-1953 PET-4 oil and
gas exploration by the Dept. of the Navy.
- 45 shallow core test wells.
- 36 test wells.
·
1974 Dept. of the Navy
initiated a 5 year contract with Husky Oil NPR Operations to manage the
exploration program.
·
1976 Naval Petroleum
Reserve 4 transferred to the Dept. of the Interior and renamed the National
Petroleum Reserve Alaska.
·
1977 Exploration
program responsibilities transferred to the USGS.
·
1982 NPRA exploration program
terminated.
- 28 test wells drilled.
- Over 12,000 line miles of seismic data collected.
- Almost $1 billion spent.
·
1980's - 1993 NOAA
stored & distributed NPRA data to the public.
·
1993 NOAA returned all
materials to the USGS.
Data Storage Problems
In 1993, the NPRA data
set and related storage problems consisted of:
·
Magnetic Tapes
- 8,000 9- and 21-track tapes.
- Required 8,000 cu. ft. of expensive conditioned storage.
- Most tapes close to or over 20 years old.
- Only one copy of many tapes.
·
Documents
- Thousands of pages stored in many cardboard boxes or map cases.
- Boxes and map cases spread over several different locations.
- Only one copy of many documents.
·
Paper or Film
Displays
- Thousands of maps, well logs, and seismic data displays stored in many cardboard boxes or map cases.
- Boxes and map cases spread over several different locations.
- Only one copy of many documents.
·
35mm Color Slides
- Hundreds of 35mm color slides of well core.
The factors presented
above created additional data storage problems:
·
Related data types
stored on different media were stored in different physical locations.
Physical
media, not the relationship of the information stored on those physical media,
dictated where and how it was stored.
For example, the demultiplexed seismic data, stored on 3,800 9-track
tapes, required 14,000 pages of observers' logs and survey notes in order to be
processed to final form. The 3,800
9-track tapes were stored in the tape library.
The 14,000 pages of documentation were stored in cabinets or boxes in
two different buildings. The potential
for loss was great.
·
The physical bulk of
the data prevented it being stored in a single location.
The
sheer bulk of the physical media required the data to be stored in several
different physical locations, raising access, maintenance, and security issues.
Data Distribution Problems
Having been collected and
processed using public funds, the data in the NPRA Legacy Data Archive have been
made available, on loan, to anyone requesting them. To answer a data request, personnel would collect,
inventory, and package for shipment the requested data. The data requestor would pay for shipping
and reproduction of the requested data.
After making the copies, the data requestor would ship the data back
where it would be received by USGS personnel and returned to storage. The original data set presented the
following data distribution problems:
·
Labor Intensive
Getting
the data in and out of storage was very labor intensive. USGS personnel had to locate it, collect it,
recreate an inventory, package it, and arrange for shipping. When the data were returned, the process had
to be reversed.
·
Time Consuming
Because of their sheer volume, some data requests took months to satisfy. At some point in the past eight years, every
one of the 3,800 demultiplexed seismic data 9-track tapes, together with the associated
14,000 pages of documentation, has been distributed multiple times in the above
fashion in response to data requests.
·
Potential for Data
Loss
There
was one copy of many data media. If
that copy was damaged or lost while on loan, it would be lost for good. Magnetic tapes presented a special case in
that the tape medium itself was deteriorating due to age and the recording
material was physically coming off the tapes with every use. Eventually the data would be gone.
·
Single Person /
Single Use Data
In
many cases, there was only one physical copy of a data item, greatly
restricting access to the data. This
meant that only one person (organization) at a time could access any physical
data item in the archive.
Using CD-ROM Technology to Solve Storage and Distribution Problems
The NPRA Legacy Data
Archive is a current and on-going project.
The following characteristics of recordable CD-ROM (CD-R) technology are
helping solve the data storage problems of the NPRA Legacy Data Archive:
·
Multiple data types may
be stored together on a CD-R, allowing logical collections of information.
·
CD-R's may be stored
under normal office conditions. No
expensive conditioned storage is required.
·
CD-R's greatly increase
the potential audience, making the data readable on any computer with a CD-ROM reader.
·
CD-R's are quickly and
easily copied, allowing multiple copies to be easily made and stored in
different locations as a disaster contingency.
·
CD-R's are random
access, unlike linear magnetic tape, providing quick access to the data.
·
CD-R's are capable of
storing 640 Mbytes of information.
Using CD-R technology the
data storage problems of the NPRA Legacy Data Archive have been addressed by:
·
Capturing magnetic tape
data to CD-R media.
·
Scanning documents,
logs, and maps to digital images and capturing those images to CD-R media.
·
Converting the 35mm
well core slides to Kodak PhotoCD and JPEG images, and capturing those images
to CD-R.
·
Inventorying the
captured data, creating databases of content.
·
Combining related data
components from the data capture CD-R's, and appropriate documentation, into
logical collections to produce archive CD-R's.
·
Making multiple copies
of the CD-R's to be stored in different locations as a disaster contingency.
·
Storing the data
capture and archive CD-R's in robotically accessed CD-ROM jukeboxes attached to
a network, allowing access to the data.
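The step of combining related components into logical collections on archive CD-R's can be illustrated with a short sketch. This is not the procedure the USGS used; it is a minimal first-fit layout that keeps every collection together on one disc and respects the 640 Mbyte capacity stated above. The collection names, file names, and sizes are hypothetical.

```python
from collections import defaultdict

CD_CAPACITY_MB = 640  # CD-R capacity stated in the text


def plan_archive_discs(items, capacity_mb=CD_CAPACITY_MB):
    """Group items by collection, then place whole collections on discs first-fit.

    items: iterable of (collection, filename, size_mb) tuples (hypothetical layout).
    Returns a list of discs; each disc is a list of (collection, filename, size_mb).
    """
    collections = defaultdict(list)
    for collection, filename, size_mb in items:
        collections[collection].append((collection, filename, size_mb))

    discs = []  # contents of each planned disc
    free = []   # remaining Mbytes on each planned disc
    for name in sorted(collections):
        members = collections[name]
        total = sum(size for _, _, size in members)
        if total > capacity_mb:
            raise ValueError(f"collection {name!r} needs {total} MB; split it first")
        # Use the first disc with enough room, otherwise start a new disc.
        for i, space in enumerate(free):
            if total <= space:
                discs[i].extend(members)
                free[i] -= total
                break
        else:
            discs.append(list(members))
            free.append(capacity_mb - total)
    return discs


if __name__ == "__main__":
    # Hypothetical inventory: seismic data kept with its documentation.
    inventory = [
        ("line-A", "a.sgy", 300),
        ("line-A", "a_obs_log.tif", 50),
        ("line-B", "b.sgy", 400),
        ("line-C", "c.sgy", 200),
    ]
    for number, disc in enumerate(plan_archive_discs(inventory), start=1):
        print(number, [filename for _, filename, _ in disc])
```

Keeping a collection whole on one disc mirrors the goal described above: the seismic data and the documentation needed to process it travel together rather than being separated by medium.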
Once data are captured to
CD-R, storage space is dramatically reduced.
For example, the demultiplexed seismic data archive, originally
consisting of 3,800 9-track seismic data tapes and 14,000 pages of paper
documents, required over 400 cu. ft. of storage, 380 cu. ft. of which was
conditioned magnetic tape storage.
Archived to 254 CD-R's, the same volume of data requires less than half a
cu. ft. of storage space in standard office conditions.
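The figures quoted above can be worked through in a few lines. The sketch below uses only numbers from the text; because the original volume is given as "over 400" and the archived volume as "less than half", the reduction it computes is a lower bound, and the total capacity is an upper bound that assumes every disc is full.

```python
# Figures quoted in the text.
TAPES = 3_800            # 9-track demultiplexed seismic tapes
ORIGINAL_CU_FT = 400.0   # "over 400 cu. ft." (lower bound)
ARCHIVED_CU_FT = 0.5     # "less than half a cu. ft." (upper bound)
CD_COUNT = 254           # archive CD-R's
CD_CAPACITY_MB = 640     # stated CD-R capacity


def storage_summary():
    """Return derived storage figures for the demultiplexed seismic archive."""
    return {
        # Minimum factor by which storage volume shrank.
        "min_reduction_factor": ORIGINAL_CU_FT / ARCHIVED_CU_FT,
        # Most data the 254 discs could hold if each were full.
        "max_archive_mb": CD_COUNT * CD_CAPACITY_MB,
        # Average number of tapes captured per disc.
        "tapes_per_cd": TAPES / CD_COUNT,
    }


if __name__ == "__main__":
    for key, value in storage_summary().items():
        print(f"{key}: {value:,.1f}")
```

In other words, storage volume fell by a factor of at least 800, the archive holds at most about 162,560 Mbytes, and each CD-R replaces roughly 15 tapes on average.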
Using CD-R technology to
solve data storage problems also solved the associated physical data
distribution problems:
·
Data have been organized
into logical collections, inventoried, and stored on CD-R's, removing or
reducing the previously lengthy preparation process in answering a data
request.
·
The reduced volume of
data recorded on CD-R makes collecting and packaging the data for distribution
a much smaller task.
·
Multiple copies of
archive CD-R's exist, protecting against loaned data being damaged
or lost.
Perhaps the greatest
benefit of using CD-R technology is that data stored on CD-R’s can be placed in
CD-ROM jukeboxes and the contents of those jukeboxes can be made accessible over
the Internet. This transforms
previously single person / single use data into information available to multiple
users simultaneously 24 hours a day, 7 days a week, with little or no human
intervention.
NPRA Legacy Data Archive Web Site
The NPRA Legacy Data
Archive web site, http://nerslweb.cr.usgs.gov,
is designed to use CD-ROM technology for data storage and the data distribution
capabilities of the Internet for data access.
This web site makes available the following data types:
SEISMIC DATA
·
SEG-B and SEG-Y format
seismic data.
·
Image files of
associated seismic data documentation.
·
Tables of seismic data
field collection parameters.
·
Seismic data location
information, both ASCII text files and image files of location maps.
·
Image files of
paper/Mylar seismic data displays.
WELL LOG DATA
·
Image files of
paper/Mylar well log displays.
·
LAS-format digital well
log data.
·
Image files of well
data, analyses, and reports.
·
Image files of the 35mm
color well core slides.
PUBLICATIONS
·
Out of print, hard to
find older publications.
·
Newer digital
publications, generally of reprocessed data.
The design
characteristics of this web site are:
·
Database-driven Web
Site
Simple
web forms allow the user to interactively query the inventories of seismic and
exploratory well data stored in MS-Access database tables. The results of the queries submitted are
returned to the user’s browser as dynamically generated HTML containing links
to the data described in the inventories.
Being database-driven offers several fundamental advantages:
1.
Web pages are generated
dynamically from the results of queries against the database tables comprising
the archive’s inventories. If the
information in the NPRA Legacy Data Archive web site were presented through
static HTML it would require an estimated 3,500+ HTML pages. Dynamically creating web pages as a result
of database queries currently requires 9 Active Server Pages for the seismic
data portion of the archive and 4 Active Server Pages for the exploratory well
data. As a result, web site maintenance
is greatly reduced.
2.
Web content is a
product of the archive’s data inventories stored in MS-Access database
tables. Should the content of the
database tables change (which it does as the data are archived), the changes
are automatically delivered to the user through the dynamically created web
pages. Again, maintenance is greatly
reduced, as only the database tables must be maintained, not both the database
tables and the HTML pages serving that information.
·
Standard HTML
The
web pages generated and returned to the user are standard HTML requiring no
special software or browser plug-ins, and are thus viewable by any web
browser. Certain data types, such as
Adobe Acrobat PDF files or MrSID compressed images (used for the location maps),
do require browser plug-ins to view the files on-screen, but access is through
standard HTML.
·
Industry Accepted,
Standard Software
The
web site is housed on a PC running the Microsoft Windows 2000 Server operating system
and served using the Microsoft Internet Information Server web site software
and Active Server Pages. The Microsoft
Access database management system is used to maintain the various archive
inventories and is accessed by the web server in response to queries submitted
through the site’s web pages.
·
Transportable
between computers if necessary.
Using
industry-standard software, the web site is transportable between comparably
configured computer systems.
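The database-driven design described above can be sketched in a few lines. The site itself uses Active Server Pages querying MS-Access tables; the example below is only an illustration of the same idea in Python, with an in-memory SQLite database standing in for the inventory. The table name, column names, line identifiers, and URLs are all hypothetical.

```python
import html
import sqlite3


def build_inventory(conn):
    """Create a small, hypothetical seismic inventory table."""
    conn.execute(
        "CREATE TABLE seismic_lines (line_id TEXT, year INTEGER, data_url TEXT)"
    )
    conn.executemany(
        "INSERT INTO seismic_lines VALUES (?, ?, ?)",
        [
            ("L-77-01", 1977, "/data/L-77-01.sgy"),
            ("L-77-02", 1977, "/data/L-77-02.sgy"),
            ("L-80-05", 1980, "/data/L-80-05.sgy"),
        ],
    )


def render_results(conn, year):
    """Query the inventory and return standard HTML linking to the matching data."""
    rows = conn.execute(
        "SELECT line_id, data_url FROM seismic_lines WHERE year = ? ORDER BY line_id",
        (year,),
    ).fetchall()
    items = "\n".join(
        f'<li><a href="{html.escape(url)}">{html.escape(line_id)}</a></li>'
        for line_id, url in rows
    )
    return (
        f"<html><body><h1>Seismic lines, {year}</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_inventory(conn)
    print(render_results(conn, 1977))
```

One query template serves every possible year, which is why a handful of dynamic pages can replace thousands of static ones. Adding a row to the table changes the next page generated, with no HTML to edit.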
Supporting the Evolving Mission of the USGS
The mission of the USGS,
as paraphrased in the National Academy of Sciences, National Research Council
(NRC) study Future Roles and Opportunities
for the U.S. Geological Survey (National Academy Press, 2001),
is to “supply information contributing to the wise management of natural
resources and that promotes the health, safety, and well-being of the nation’s
citizens.” This report further states
the USGS is evolving into a natural resource and information agency taking a
leadership role in the use of modern technology for the effective and efficient
dissemination of this information. The
NPRA Legacy Data Archive supports these goals by taking hard to store,
hard to access, single person / single use information and, using current
technologies, making it accessible 24 hours a day, 7 days a week,
from anywhere in the world with an Internet connection.
Conclusion
The NPRA Legacy Data
Archive is one of the largest geophysical and geological data sets in the USGS,
costing close to $1 billion, in 1970s dollars, to collect and
process. Left in their original format,
these data would essentially be non-existent, as access to them was either
difficult or impossible. Using CD-R
technology, these data have been, and are currently being, archived to a stable,
space-efficient, easily accessed and reproduced medium. This CD-R archive is being made
Internet-accessible using fairly inexpensive, reliable, readily available
hardware and software. Using this
technology, the USGS is transforming information which was originally single
person / single use at best, into information available to many simultaneous
users worldwide, 24 hours per day, 7 days per week.