Building Energy Science and Technology Digital Collections
for an Information Infrastructure for the Physical Sciences
   

Presented at the Virtual Reference Desk (VRD) Conference, held in Seattle, Washington, on October 16-17, 2000

Karen J. Spence and Mary V. Schorn

Abstract 

The U.S. Department of Energy Office of Scientific and Technical Information provides a suite of innovative digital information resources for a stronger America.  Included in these resources are world-class products that address the three main ways by which researchers disseminate their findings: the DOE Information Bridge (grey literature), PubSCIENCE (journal literature), and the PrePRINT Network (preprints).  These products are key components of the suite of resources provided through EnergyFiles, a virtual library of energy-related scientific and technical information.  Each product can be searched individually or in parallel with other energy-related resources using EnergyPortal, which is the groundbreaking distributed search mechanism of EnergyFiles.  This history of success lays the foundation for OSTI’s new initiative, a future Information Infrastructure for the Physical Sciences.

 Introduction 

The Department of Energy (DOE) is among the leading research agencies in the world, investing $7 billion annually in research and development (R&D).  It is vitally important for research agencies to disseminate their information as broadly and as quickly as possible, providing access to the data and information that fuel essential knowledge.  For over 50 years, DOE’s Office of Scientific and Technical Information (OSTI) has been collecting, preserving, and disseminating the Department’s scientific and technical information (STI).  By utilizing Information Age technologies, OSTI has radically changed its information services and has developed a suite of award-winning Internet resources that bring science information to the desktop at no cost to the user.  These resources give scientists, researchers, academia, industry, and the public easier, faster, cheaper, more complete, and more convenient means of accessing and using global STI.

One-stop shopping access to this suite of resources is provided through EnergyFiles, a Web-based virtual library that provides easy access to collections of both DOE and worldwide energy-related scientific and technical information.  EnergyFiles contains a search mechanism, EnergyPortal, that is easy to use, integrates parallel searching, and retrieves information from heterogeneous and geographically dispersed databases and Web sites.  Key components of EnergyFiles are DOE R&D Project Summaries (current research), DOE R&D Accomplishments (outcomes of past research), the DOE Information Bridge (grey literature), PubSCIENCE (peer-reviewed journal literature), and the PrePRINT Network (preprints).  These products have been designed to provide remote access to billions of dollars of energy‑related research performed by DOE and its collaborators. 

DOE Information Bridge 

The DOE Information Bridge, made available in April 1998 in collaboration with the U.S. Government Printing Office (GPO), contains DOE report literature from 1995 forward.  It incorporates over 55,000 full-text reports comprising over 4.3 million pages.  It provides free, convenient, and quick access to full‑text DOE research and development reports in physics, chemistry, materials, biology, environmental sciences, energy technologies, engineering, computer and information science, renewable energy, and other topics.  Users remotely access and download the reports free of charge and in significant volume. 

The DOE Information Bridge focuses on providing access to scientific and technical reports produced by DOE, DOE national laboratories and DOE contractors.  New reports processed by OSTI are added routinely and legacy reports are added as resources permit.  Since its introduction, the content of DOE Information Bridge has more than doubled. 

DOE Information Bridge search options include a basic approach that can be concentrated on specific data fields and an advanced approach that includes Boolean operators to increase search precision.  Users can search the entire collection (full text and bibliographic data), or they can search portions of it.  Three formats, GIF, PDF and TIFF, are available for viewing the full‑text page images.  Two formats, PDF (image only) and the original input format, are available for downloading full‑text documents.  This makes reports far easier to use and eliminates the cumbersome and time-consuming practices associated with searching traditional media. 
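
The difference between the basic, field-restricted approach and the advanced Boolean approach can be illustrated with a minimal sketch.  It is not the DOE Information Bridge implementation: the records, the field names (title, full_text), and the search function are hypothetical, and simple substring matching stands in for whatever indexing the real system uses.

```python
# Hypothetical records; the field names are illustrative only and are
# not the actual DOE Information Bridge schema.
RECORDS = [
    {"id": 1, "title": "Solar cell efficiency",
     "full_text": "photovoltaic materials and thin film deposition"},
    {"id": 2, "title": "Reactor materials",
     "full_text": "corrosion of steel in nuclear reactor coolant"},
    {"id": 3, "title": "Thin film corrosion",
     "full_text": "corrosion of photovoltaic thin film coatings"},
]


def matches(record, term, fields=None):
    """Return True if term occurs (case-insensitive) in any chosen field."""
    fields = fields or ("title", "full_text")
    return any(term.lower() in record[f].lower() for f in fields)


def search(records, all_of=(), any_of=(), none_of=(), fields=None):
    """Boolean search: every all_of term (AND), at least one any_of term
    (OR, if any are given), and no none_of term (NOT) must match."""
    hits = []
    for r in records:
        if not all(matches(r, t, fields) for t in all_of):
            continue
        if any_of and not any(matches(r, t, fields) for t in any_of):
            continue
        if any(matches(r, t, fields) for t in none_of):
            continue
        hits.append(r["id"])
    return hits


# Basic search over the whole record, then restricted to one field.
whole = search(RECORDS, all_of=["corrosion"])
title_only = search(RECORDS, all_of=["corrosion"], fields=("title",))
# Advanced search: corrosion AND NOT nuclear.
refined = search(RECORDS, all_of=["corrosion"], none_of=["nuclear"])
```

In this sketch, restricting the search to a field or adding a NOT term narrows the hit list, which is the gain in precision that the advanced approach is meant to provide.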

Awards and recognitions received by the DOE Information Bridge include

          receiving a commendation to DOE and GPO from the Depository Library Council to the Public Printer;

          receiving Vice President Gore’s National Performance Review Hammer Award;

          receiving the DOE Information Management Technical Excellence Award;

          being highlighted in the October 1, 1998, inaugural issue of Access America Online Magazine, a product of the Government Information Technology Services Board;

          being favorably reviewed in the University of Wisconsin's "Scout Report" for science and engineering; and

          being cited by the Global SchoolNet Foundation and Yahoo (Pick of the Day and Week). 

Building and expanding the DOE Information Bridge reinforces DOE’s and GPO’s commitment to make available DOE research reports and to move federal programs and activities into the ever-expanding world of the Information Age. 

PubSCIENCE  

PubSCIENCE was developed to facilitate searching and accessing peer-reviewed journal literature in the physical sciences and other disciplines of interest to DOE.  Made available in collaboration with the Government Printing Office (GPO) in October 1999, it provides for quick, easy, and free searching of a compendium of peer-reviewed journal citations and abstracts about the physical sciences and other energy-related disciplines.   Hyperlinks provide access to publisher servers to obtain full-text articles if the user or organization has a subscription to the journal.  If the user lacks such a subscription, access to the full text can be obtained by pay per view, by special arrangement with the publisher, by library access, or through commercial providers. 

Over twenty-five of the most prestigious publishers in the world today are represented in PubSCIENCE, which facilitates searching of and access to over 1.8 million records in over 1000 journal titles of peer-reviewed scientific and technical information.  

PubSCIENCE represents the convergence of recent advances in information technology tools (as evidenced by the Internet), the re-engineering of traditional DOE products and services, the growing interest of scientific journal publishers in using the Internet, the information needs of the DOE research community, and the desire of the GPO to work with other agencies to make electronic government information and tools available to the public. 

Not only is the Internet changing the way publishers think about publishing, but it has also changed how government views its role in the dissemination of scientific and technical information.  PubSCIENCE is an outstanding example of converging interests: the user’s desire to access current scientific and technical literature, the Department’s desire to facilitate the flow of peer-reviewed scientific and technical information, and publishers’ interest in obtaining the widest possible visibility for their published materials. 

PrePRINT Network 

The PrePRINT Network was unveiled in January 2000.  It is a searchable gateway to preprint sites that contain information about scientific and technical disciplines of concern to DOE.  Such disciplines include physics, materials, chemistry, and portions of biology, environmental sciences, and nuclear medicine.  Collections and resources included on the PrePRINT Network are provided by academic institutions, government research laboratories, scientific societies, private research organizations, and individual scientists and researchers.  The PrePRINT Network facilitates access to these resources, but does not change the content or data provided by the originating site or author.  The PrePRINT Network combines these dispersed sites into a comprehensive set of energy research information.   

The PrePRINT Network expedites the dissemination of scientists’ research results.  It is Web-based and provides access to energy-related papers, draft journal articles, and other electronic materials produced by researchers.  It provides links to 1000 preprint sites housing over 330,000 documents.  Over twenty heterogeneous preprint databases are available for distributed cross searching via a single query.  In addition, the PrePRINT Network provides links to over 170 related scientific societies and associations. 

The PrePRINT Network offers users three options for locating information.  They can browse or search one specific preprint site or a selected set of sites.  The first option, Browse, allows users to view an alphabetical listing of all of the sites included in the system and to visit any of the individual sites listed.  Within this option, users may also choose to perform an indexed search of the HTML pages of the available sites.  This search returns hits for any pages and linked pages that contain the specified search term, including some items that may not be actual preprints.  The second option, Search Selected Sites, allows users to send a single query to the search engines of selected preprint sites.  This search capability then compiles the results and returns them to the user.  The third option, Subject Pathways, offers users the ability to browse the preprint collections by subject area.  This section includes both preprint collections and preprints posted by individual scientists on their own sites. 

In most cases, access to the full-text information on the target sites is open and free of charge.  Because the PrePRINT Network eliminates the need to locate individual preprint sites through web searching, researchers can find more relevant information while saving time.  The PrePRINT Network is a single point of entry for preprints in the scientific and technical areas. 

Additional Digital Collections 

In addition to this trio of products, which addresses the three main ways by which researchers disseminate their findings, OSTI has developed complementary digital collections.  These include:

          DOE R&D Project Summaries, which provides brief descriptions of over 17,000 R&D projects currently ongoing within the DOE

          DOE R&D Accomplishments, which showcases outcomes of past DOE research and development that have had significant economic impact, have improved people’s lives, or have been widely recognized as a remarkable advance in science

          OpenNet, which covers the DOE legacy collection of declassified documents and has been developed and maintained by OSTI for the DOE’s Office of Declassification

          ECAPs, which are electronic current awareness publications providing subject-based collections and are sponsored by DOE Programs

          Federal R&D Project Summaries, which was developed as a proof of principle to demonstrate the value of having a portal to information about Federal research projects 

DOE R&D Project Summaries was unveiled in June 1997 to provide the public with access to key corporate information on over 20,000 research and development projects performed since 1995 by the Department’s laboratories and other research facilities.  It includes DOE research activities in a wide variety of energy-related scientific disciplines.  R&D Project Summaries enables DOE to educate and inform the general public about its current research and development activities and provides a mechanism for public access to information about Departmental research capabilities and activities.  DOE R&D Project Summaries received DOE’s Information Management Quality Award for Management/Administrative Excellence in 1997 and was recognized with a Hammer Award from Vice President Gore’s National Partnership for Reinventing Government in 1999. 

The DOE R&D Accomplishments Web site showcases the proud heritage of the Department’s research and development and highlights benefits that are being realized now.  It was unveiled in March 1999 as a central forum for providing the public with information about outcomes of past DOE‑sponsored or generated research and development.  The outcomes featured have had significant economic impact, have improved people’s lives, or have been widely recognized as a remarkable advance in science.  The core of the Web site is the DOE R&D Accomplishments Database, consisting of searchable full-text and bibliographic citations of documents reporting accomplishments from DOE and DOE contractor facilities.  Complementing the Database is a page of "Snapshots."  It contains links to items or articles that contain information about or identify at least one research and development accomplishment.  Snapshots are quick pictures, introductions, overviews, or synopses. When more information about a Snapshots topic is available via the DOE R&D Accomplishments Database, links to full-text reports are identified and provided. 

OpenNet provides easy, timely access to recently declassified DOE information, including information declassified in response to Freedom of Information Act requests, and makes it more readily available to the public.  It includes references to all documents declassified and made publicly available after October 1, 1994, and supports the four processes envisioned by the Openness Initiative: Public Awareness, Public Education, Public Input, and Public Access.

ECAPs (Electronic Current Awareness Publications) are a collection of bibliographic citations, broken out by subject area, from the Energy Science and Technology Database (EDB).  For DOE reports, links are provided to full text.  These long-standing paper publications were recently transitioned to a searchable Web product.  OSTI publishes several separate ECAPs and maintains a collection of over 30,000 ECAP citations. 

Federal R&D Project Summaries was released in April 2000 and provides a unique window to the Federal research community, allowing Agencies to better understand the research and development efforts of their counterparts in government.  It gives the public insight into how its investment in research and development is being used and supports full-text single-query searching across databases residing at different governmental Agencies. 

EnergyFiles 

The umbrella for this suite of resources is EnergyFiles, which was released in May 1997.  It is a Web-based virtual library that provides easy access to over 500 widely diverse collections of both DOE and worldwide energy-related STI.  EnergyFiles is a dynamic information system that offers users, participants and contributors the opportunity to leverage collections and capabilities and to maximize use of energy-related scientific and technical information. 

The EnergyFiles search mechanism, EnergyPortal Search, provides for increased site efficiency and ease of knowledge discovery.  EnergyPortal has conquered a major obstacle confronting multi-source virtual libraries.  It is a unique search capability that provides distributed searching across the decentralized, heterogeneous databases and web sites linked to EnergyFiles.  The user no longer needs to select individual links to sift through available information in pursuit of what is relevant.  Words or phrases are entered in a single query box and the query is distributed in parallel to the user-selected multiple databases and Web sites residing at diverse locations. 

EnergyPortal Search continues to represent a breakthrough in information retrieval.  It enables users to search across 26 databases and 500 Web sites using a single query.  The sites are maintained by various agencies, are geographically dispersed, and require no standardization in terms of format, software or metadata.  EnergyPortal will search full text when available; DOE databases and collections; databases of other agencies such as the Defense Technical Information Center (DTIC), the National Aeronautics and Space Administration (NASA), the National Library of Medicine (NLM) and the Environmental Protection Agency (EPA); and other resources.  When the individual database supports it, the searched word or phrase is highlighted for easy access. 
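
The general idea of distributing one query in parallel to user-selected sources that share no common format can be sketched as follows.  This is an illustration under stated assumptions, not a description of how EnergyPortal is built: the three source functions, the SOURCES table, and distributed_search are hypothetical names, the sources are in-memory stand-ins for remote databases, and the handling of an unresponsive source is an added assumption.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote sources.  Each one returns hits in
# its own shape, mimicking the lack of a common format or metadata.
def search_reports_db(query):
    data = ["Fuel cell membranes", "Wind turbine blades"]
    return [{"title": t} for t in data if query.lower() in t.lower()]


def search_preprint_site(query):
    data = [("Fuel cell catalysts", "2000"), ("Quark matter", "1999")]
    return [row for row in data if query.lower() in row[0].lower()]


def search_failing_site(query):
    raise TimeoutError("source did not respond")


# One small adapter per source turns its native hit into a plain title,
# so the sources themselves need no standardization.
SOURCES = {
    "reports": (search_reports_db, lambda hit: hit["title"]),
    "preprints": (search_preprint_site, lambda hit: hit[0]),
    "offline": (search_failing_site, lambda hit: str(hit)),
}


def distributed_search(query, selected):
    """Send one query to every selected source in parallel and merge
    the normalized hits; failures are reported, not fatal (assumption)."""
    def run(name):
        search_fn, to_title = SOURCES[name]
        try:
            return name, [to_title(h) for h in search_fn(query)], None
        except Exception as exc:
            return name, [], str(exc)

    with ThreadPoolExecutor(max_workers=max(1, len(selected))) as pool:
        outcomes = list(pool.map(run, selected))  # keeps selection order

    results = [{"source": n, "title": t}
               for n, titles, _ in outcomes for t in titles]
    errors = {n: e for n, _, e in outcomes if e is not None}
    return results, errors


results, errors = distributed_search(
    "fuel cell", ["reports", "preprints", "offline"])
```

Keeping the format knowledge in per-source adapters is one way to let the participating sites stay as they are; only the search front end needs to know how to read each site's results.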

This distributed search capability demonstrates an essential next step in information technology: the integration of parallel searching and retrieving of information from disparate and geographically dispersed databases and Web sites.  The EnergyPortal distributed search extends across other government agencies’ full-text information sources.  Because it includes only unlimited, unclassified energy-related information, users in government, industry, academia and the public benefit from the addition of this capability.  Time is saved through more efficient and effective information retrieval because the information is accessible on the Internet in an organized, searchable format. 

Awards and recognitions received by EnergyFiles are

          receiving Vice President Gore’s National Performance Review Hammer Award;

          being highlighted in the October 1, 1998, inaugural issue of Access America Online Magazine, a product of the Government Information Technology Services Board;

          being favorably reviewed in the University of Wisconsin’s “Scout Report” for science and engineering; and

          being featured in "Federal Computer Week," with emphasis on EnergyPortal.

Future Information Infrastructure for the Physical Sciences 

This history of success lays the foundation for a future Information Infrastructure for the Physical Sciences, which focuses on energy, science, and technology.  The goal of this new initiative is to provide a comprehensive resource, available at the desktop, for worldwide scientific information – a Web-based network where researchers, engineers, educators, students, industry, and the public can come for answers.   

The objectives of the Information Infrastructure for the Physical Sciences are to deliver a permanent, comprehensive resource for accessing and using scientific information; facilitate research and discovery to secure a healthy and competitive science and technology future; raise scientific and technological literacy of all Americans; produce the finest scientists and engineers for the 21st Century; promote scientific research and development (R&D) results as a foundation for future advancements; and establish a digital library that is complementary to existing national libraries in providing Federally-sponsored information to the public.  These existing libraries include the National Library of Medicine, the National Agricultural Library, the National Library of Education, the National Transportation Library, the EPA National Library Network, and the National Science, Mathematics, Engineering, and Technology Education Digital Library (NSDL). 

The Information Infrastructure for the Physical Sciences will significantly expand DOE's local presence across the nation.  Such an active publicly oriented presence will bring scientific and technical information, energy data and prices, and consumer and educational information to the regional level for application and use at the local level.  

The major challenge to the usefulness of a digital library is how to search across heterogeneous databases and Web sites when there is no standardization of data and information resides in multiple forms on a variety of unrelated systems at widely dispersed facilities.  Sophisticated distributed searching capabilities will allow the user to access information without having to know which database to access, which information collection to pursue, or the organizational structure of the agency making the information available.  This search capability must be augmented by the ability to deliver the retrieved information electronically to the desktop, either directly, through licensing agreements, or through other cooperative arrangements. 

User benefits of a digital library focused on energy, science, and technology are a well-organized, comprehensive resource not limited by traditional boundaries; access to both historic and current information; practical information for the consumer; easy, fast, accurate navigation through collections and resources; science education resources for educators and students; remote access to scientific hardware and software; information alert mechanisms to serve industry and commerce; and permanent public access to Departmental resources. 

With the Information Infrastructure for the Physical Sciences, a foundation will be provided for the innovative use of three key resources -- worldwide information, advanced technology, and people -- to deliver validated research information while strengthening and sustaining the Nation’s leadership in science and technology.  Resource requirements, partnership arrangements, and numerous other planning activities are currently being explored as support for this initiative continues to grow.


Author Contact Information
Karen J. Spence
Assistant Director
Office of Program Integration
U.S. Department of Energy
Office of Scientific and Technical Information
P.O. Box 62    
Oak Ridge, TN 37831
Phone: (865) 574-0295 
Fax: (865) 241-3826
spencek@osti.gov

 

Mary V. Schorn
Technical Information Specialist
Office of Program Integration
U.S. Department of Energy   
Office of Scientific and Technical Information
P.O. Box 62
Oak Ridge, TN 37831
Phone: (865) 576-2413
Fax: (865) 241-3826
schornm@osti.gov