NERSL NPRA Legacy Data Archive Home Page

NPRA Legacy Data Archive

 

Introduction

The National Petroleum Reserve, Alaska, (NPRA) Legacy Data Archive represents one of the largest geological and geophysical data sets held by the U.S. Geological Survey (USGS).  From 1944 to 1953 the U.S. Navy operated a large-scale exploration of the then Naval Petroleum Reserve No. 4, drilling 36 test wells and 45 core tests.  A second, more extensive exploration program was operated between 1974 and 1982.  Run first by the U.S. Navy and later the USGS, this exploration program collected over 12,000 line miles of seismic data and drilled 28 exploratory wells.  Both these exploration programs generated a vast amount of data (digital and analog), analyses, and documents that are being captured, inventoried, cataloged, and archived to Compact Disc-Recordable media and made available over the Internet.

Initially distributed to the public by the National Oceanic and Atmospheric Administration's (NOAA) National Geophysical Data Center, the entire data set was returned to the USGS in 1993.  The size and content of this data set presented the USGS, the agency responsible for maintaining and distributing this information to the public, with some fundamental data storage and distribution problems.

 

 

A Brief History of the NPRA

·         23 million acre area, approximately the size of the state of Indiana.

·         Largely unexplored until the early 1900's.

·         1923 President Harding formed Naval Petroleum Reserve 4

·         1923-1926 Initial surveys by the USGS for the Dept. of the Navy.

·         1926-1943 Little exploration done.

·         1943-1953 PET-4 oil and gas exploration by the Dept. of the Navy.

§         45 shallow core test wells.

§         36 test wells.

·         1974 Dept. of the Navy initiated a 5 year contract with Husky Oil NPR Operations to manage the exploration program.

·         1976 Naval Petroleum Reserve 4 transferred to the Dept. of the Interior and renamed the National Petroleum Reserve Alaska.

·         1977 Exploration program responsibilities transferred to the USGS.

·         1982 NPRA exploration program terminated.

§         28 test wells drilled.

§         Over 12,000 line miles of seismic data collected.

§         Almost $1 billion spent.

·         1980's - 1993 NOAA stored & distributed NPRA data to the public.

·         1993 NOAA returned all materials to the USGS.


 

Data Storage Problems

In 1993, the NPRA data set and related storage problems consisted of:

 

·         Magnetic Tapes

§         8,000 9- and 21-track tapes.

§         Required 8,000 cu. ft. of expensive conditioned storage.

§         Most tapes close to or over 20 years old.

§         Only one copy of many tapes.

·         Documents

§         Thousands of pages stored in many cardboard boxes or map cases.

§         Boxes and map cases spread over several different locations.

§         Only one copy of many documents.

·         Paper or Film Displays

§         Thousands of maps, well logs, and seismic data displays stored in many cardboard boxes or map cases.

§         Boxes and map cases spread over several different locations.

§         Only one copy of many documents.

·         35mm Color Slides

§         Hundreds of 35mm color slides of well core.

 

The factors presented above created additional data storage problems:

 

·         Related data types stored on different media were stored in different physical locations.

Physical media, not the relationship of the information stored on those physical media, dictated where and how it was stored.  For example, the demultiplexed seismic data, stored on 3,800 9-track tapes, required 14,000 pages of observers' logs and survey notes in order to be processed to final form.  The 3,800 9-track tapes were stored in the tape library.  The 14,000 pages of documentation were stored in cabinets or boxes in two different buildings.  The potential for loss was great.

 

·         The physical bulk of the data prevented it being stored in a single location.

The sheer bulk of the physical media required the data to be stored in several different physical locations, raising access, maintenance, and security issues.

 

 

Data Distribution Problems

Having been collected and processed using public funds, the data in the NPRA Legacy Data Archive have been made available, on loan, to anyone requesting them.  To answer a data request, personnel would collect, inventory, and package the requested data for shipment.  The data requestor would pay for shipping and reproduction of the requested data.  After making the copies, the data requestor would ship the data back to the USGS, where personnel would receive them and return them to storage.  The original data set presented the following data distribution problems:

 

·         Labor Intensive

Getting the data in and out of storage was very labor intensive.  USGS personnel had to locate the data, collect them, recreate an inventory, package them, and arrange for shipping.  When the data were returned, the process had to be reversed.

 

·         Time Consuming

Because of their sheer volume, some data requests took months to satisfy.  At some point in the past eight years, every one of the 3,800 demultiplexed seismic data 9-track tapes, along with the associated 14,000 pages of documentation, has been distributed multiple times in the above fashion in response to data requests.

 

·         Potential for Data Loss

There was only one copy of many data media.  If that copy was damaged or lost while on loan, it would be lost for good.  Magnetic tapes presented a special case in that the tape medium itself was deteriorating due to age and the recording material was physically coming off the tapes with every use.  Eventually the data would be gone.

 

·         Single Person / Single Use Data

In many cases, there was only one physical copy of a data item, greatly restricting access to the data.  This meant that only one person (organization) at a time could access any physical data item in the archive.

 

Using CD-ROM Technology to Solve Storage and Distribution Problems

The NPRA Legacy Data Archive is a current and ongoing project.  The following characteristics of recordable CD-ROM (CD-R) technology are helping solve the data storage problems of the NPRA Legacy Data Archive:

 

·         Multiple data types may be stored together on a CD-R, allowing logical collections of information.

·         CD-R's may be stored under normal office conditions.  No expensive conditioned storage is required.

·         CD-R's greatly increase the potential audience, making the data available on any computer having a CD-ROM reader.

·         CD-R's are quickly and easily copied, allowing multiple copies to be easily made and stored in different locations as a disaster contingency.

·         CD-R's are random access, unlike linear magnetic tape, providing quick access to the data.

·         CD-R's are capable of storing 640 Mbytes of information.

 

Using CD-R technology the data storage problems of the NPRA Legacy Data Archive have been addressed by:

 

·         Capturing magnetic tape data to CD-R media.

·         Scanning documents, logs, and maps to digital images and capturing those images to CD-R media.

·         Converting the 35mm well core slides to Kodak PhotoCD and JPEG images, and capturing those images to CD-R.

·         Inventorying the captured data, creating databases of content.

·         Combining related data components from the data capture CD-R's, and appropriate documentation, into logical collections to produce archive CD-R's.

·         Making multiple copies of the CD-R's to be stored in different locations as a disaster contingency.

·         Storing the data capture and archive CD-R's in robotically accessed CD-ROM jukeboxes attached to a network, allowing access to the data.
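
The grouping step above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the item records, the `collection` key, the file names and sizes, and the simple fill-in-order rule are all hypothetical, since the procedure actually used to assemble the archive CD-R's is not described here.  The sketch keeps related items from different source media (tapes and their paper documentation) together, and starts a new disc whenever the next item would exceed the 640 Mbyte capacity.

```python
from collections import defaultdict

CDR_CAPACITY_MB = 640  # CD-R capacity stated in the text


def plan_archive_discs(items, capacity_mb=CDR_CAPACITY_MB):
    """Group items by collection, then fill discs in order within each one.

    items: iterable of (collection, name, size_mb) tuples (hypothetical schema).
    Returns {collection: [[name, ...], ...]}, one inner list per disc.
    """
    by_collection = defaultdict(list)
    for collection, name, size_mb in items:
        if size_mb > capacity_mb:
            raise ValueError(f"{name} does not fit on a single disc")
        by_collection[collection].append((name, size_mb))

    plan = {}
    for collection, members in by_collection.items():
        discs, current, used = [], [], 0
        for name, size_mb in members:
            # Start a new disc when the next item would overflow this one.
            if current and used + size_mb > capacity_mb:
                discs.append(current)
                current, used = [], 0
            current.append(name)
            used += size_mb
        if current:
            discs.append(current)
        plan[collection] = discs
    return plan


# Placeholder inventory: names and sizes are invented for illustration.
items = [
    ("line-1", "tape_0001.segb", 300),
    ("line-1", "observer_log.tif", 40),
    ("line-1", "tape_0002.segb", 320),
    ("line-2", "tape_0003.segb", 200),
    ("line-2", "survey_notes.tif", 25),
]
plan = plan_archive_discs(items)
for collection, discs in plan.items():
    print(collection, discs)
```

Filling discs in inventory order is the simplest possible rule; it wastes some space compared with a tighter packing, but it keeps each disc's contents predictable from the inventory.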

 

Once data are captured to CD-R, storage space is dramatically reduced.  For example, the demultiplexed seismic data archive, originally consisting of 3,800 9-track seismic data tapes and 14,000 pages of paper documents, required over 400 cu. ft. of storage, 380 cu. ft. of which was conditioned magnetic tape storage.  Archived to 254 CD-R's, the same volume of data requires less than half a cu. ft. of storage space in standard office conditions.
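
The figures quoted above imply some rough ratios, worked out in the short Python sketch below.  It uses only numbers stated in this document.  Because the original volume is given as "over" 400 cu. ft. and the archived volume as "less than" half a cu. ft., the reduction factor is a lower bound; the total capacity is an upper bound on the data volume, since it assumes every disc is full.

```python
# Figures taken from the text above.
TAPES = 3_800            # 9-track demultiplexed seismic data tapes
CDRS = 254               # CD-R's holding the same data
ORIGINAL_CU_FT = 400.0   # "over 400 cu. ft." (lower bound)
ARCHIVED_CU_FT = 0.5     # "less than half a cu. ft." (upper bound)
CDR_CAPACITY_MB = 640    # stated CD-R capacity


def storage_summary():
    """Return rough ratios implied by the quoted figures."""
    return {
        "space_reduction_factor": ORIGINAL_CU_FT / ARCHIVED_CU_FT,
        "tapes_per_cdr": TAPES / CDRS,
        "max_archive_mb": CDRS * CDR_CAPACITY_MB,
    }


summary = storage_summary()
print(f"Space reduction: at least {summary['space_reduction_factor']:.0f}x")
print(f"Average tapes per CD-R: about {summary['tapes_per_cdr']:.1f}")
print(f"Total CD-R capacity: at most {summary['max_archive_mb']:,} Mbytes")
```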

 

Using CD-R technology to solve data storage problems also solved the associated physical data distribution problems:

 

·         Data have been organized into logical collections, inventoried, and stored on CD-R's, removing or reducing the previously lengthy preparation process in answering a data request.

·         The reduced volume of data recorded on CD-R makes collecting and packaging the data for distribution a much smaller task.

·         Multiple copies of archive CD-R's exist, protecting against loaned data being damaged or lost.

 

Perhaps the greatest benefit of using CD-R technology is that data stored on CD-R’s can be placed in CD-ROM jukeboxes and the contents of those jukeboxes can be made accessible over the Internet.  This transforms previously single person / single use data into information available to multiple users simultaneously 24 hours a day, 7 days a week, with little or no human intervention.

 

NPRA Legacy Data Archive Web Site

The NPRA Legacy Data Archive web site, http://nerslweb.cr.usgs.gov, is designed to use CD-ROM technology for data storage and the data distribution capabilities of the Internet for data access.  This web site makes available the following data types:

 

SEISMIC DATA

·         SEG-B and SEG-Y format seismic data.

·         Image files of associated seismic data documentation.

·         Tables of seismic data field collection parameters.

·         Seismic data location information, both ASCII text files and image files of location maps.

·         Image files of paper/Mylar seismic data displays.

 

WELL LOG DATA

·         Image files of paper/Mylar well log displays.

·         LAS-format digital well log data.

·         Image files of well data, analyses, and reports.

·         Image files of the 35mm color well core slides.

 

PUBLICATIONS

·         Out of print, hard to find older publications.

·         Newer digital publications, generally of reprocessed data.

 

 

The design characteristics of this web site are:

 

·         Database-driven Web Site

Simple web forms allow the user to interactively query the inventories of seismic and exploratory well data stored in MS-Access database tables.  The results of the queries submitted are returned to the user’s browser as dynamically generated HTML containing links to the data described in the inventories.  Being database-driven offers several fundamental advantages:

 

1.       Web pages are generated dynamically from the results of queries against the database tables comprising the archive’s inventories.  If the information in the NPRA Legacy Data Archive web site were presented through static HTML it would require an estimated 3,500+ HTML pages.  Dynamically creating web pages as a result of database queries currently requires 9 Active Server Pages for the seismic data portion of the archive and 4 Active Server Pages for the exploratory well data.  As a result, web site maintenance is greatly reduced.

2.       Web content is a product of the archive’s data inventories stored in MS-Access database tables.  Should the content of the database tables change (which it does as the data are archived), the changes are automatically delivered to the user through the dynamically created web pages.  Again, maintenance is greatly reduced, as only the database tables must be maintained, not both the database tables and the HTML pages serving that information.
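
The two advantages above can be illustrated with a minimal sketch.  The site itself uses Active Server Pages querying MS-Access tables; the Python below is only a stand-in that uses the standard-library `sqlite3` module to show the same pattern.  The table name `seismic_inventory`, its columns, and the sample rows are hypothetical placeholders, not the archive's actual schema or contents.

```python
import sqlite3
from html import escape


def build_demo_db():
    """Create an in-memory inventory table with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seismic_inventory (line_id TEXT, data_format TEXT, href TEXT)"
    )
    conn.executemany(
        "INSERT INTO seismic_inventory VALUES (?, ?, ?)",
        [
            ("LINE-A", "SEG-Y", "/data/line-a.sgy"),
            ("LINE-A", "SEG-B", "/data/line-a.segb"),
            ("LINE-B", "SEG-Y", "/data/line-b.sgy"),
        ],
    )
    return conn


def render_results(conn, data_format):
    """Query the inventory and return an HTML page linking to each match."""
    rows = conn.execute(
        "SELECT line_id, href FROM seismic_inventory "
        "WHERE data_format = ? ORDER BY line_id",
        (data_format,),  # parameterized query: user input never enters the SQL text
    ).fetchall()
    items = "\n".join(
        f'<li><a href="{escape(href, quote=True)}">{escape(line_id)}</a></li>'
        for line_id, href in rows
    )
    return (
        f"<html><body><h1>{escape(data_format)} results</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>"
    )


conn = build_demo_db()
page = render_results(conn, "SEG-Y")
print(page)

# Adding an inventory row changes the next generated page; no HTML is edited.
conn.execute(
    "INSERT INTO seismic_inventory VALUES (?, ?, ?)",
    ("LINE-C", "SEG-Y", "/data/line-c.sgy"),
)
page_after_update = render_results(conn, "SEG-Y")
```

One page-generating function serves every possible query result, which is why a handful of dynamic pages can replace thousands of static ones, and why updating the inventory tables is the only maintenance step.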

 

·         Standard HTML

The web pages generated and returned to the user are standard HTML requiring no special software or browser plug-ins, and are thus viewable by any web browser.  Certain data types, such as Adobe Acrobat PDF files or MrSID compressed images (used for the location maps) do require browser plug-ins to view the files on-screen, but access is through standard HTML.

 


·         Industry Accepted, Standard Software

The web site is housed on a PC running the Microsoft Windows 2000 Server operating system and served using the Microsoft Internet Information Server web server software and Active Server Pages.  The Microsoft Access database management system is used to maintain the various archive inventories and is accessed by the web server in response to queries submitted through the site’s web pages.

 

·         Transportable between computers if necessary.

Using industry-standard software, the web site is transportable between comparably configured computer systems.

 

 

Supporting the Evolving Mission of the USGS

The mission of the USGS, as paraphrased in the National Academy of Sciences, National Research Council (NRC) study Future Roles and Opportunities for the U.S. Geological Survey (National Academy Press, 2001), is to “supply information contributing to the wise management of natural resources and that promotes the health, safety, and well-being of the nation’s citizens.”  This report further states that the USGS is evolving into a natural resource and information agency taking a leadership role in the use of modern technology for the effective and efficient dissemination of this information.  The NPRA Legacy Data Archive supports these statements by taking hard to store, hard to access, single person / single use information and, using current technologies, making this information accessible 24 hours / day, 7 days / week, from anywhere in the world having an Internet connection.

 

 

Conclusion

The NPRA Legacy Data Archive is one of the largest geophysical and geological data sets in the USGS, costing close to $1 billion, in 1970's dollars, to collect and process.  Left in their original format, these data would essentially be non-existent, as access to them was either difficult or impossible.  Using CD-R technology, these data have been, and are currently being, archived to a stable, space-efficient, easily accessed and reproduced medium.  This CD-R archive is being made Internet-accessible using fairly inexpensive, reliable, readily available hardware and software.  Using this technology, the USGS is transforming information which was originally single person / single use at best, into information available to many simultaneous users worldwide, 24 hours per day, 7 days per week.

 
