Building Energy Science and Technology Digital Collections for an Information Infrastructure for the Physical Sciences
Virtual Reference Desk (VRD) Conference, which was held in Seattle, Washington on Oct. 16-17, 2000
Karen J. Spence and Mary V. Schorn
Abstract
The
U.S. Department of Energy Office of Scientific and Technical Information
provides a suite of innovative digital information resources for a stronger
America. Included in these
resources are world-class products that address the three main ways by which
researchers disseminate their findings: the DOE Information Bridge (grey
literature), PubSCIENCE (journal literature), and the PrePRINT
Network (preprints). These
products are key components of the suite of resources provided through EnergyFiles,
a virtual library of energy-related scientific and technical information. Each product can be searched individually or in parallel with
other energy-related resources using EnergyPortal, which is the
groundbreaking distributed search mechanism of EnergyFiles.
This history of success lays the foundation for OSTI’s new
initiative, a future Information Infrastructure for the Physical Sciences.
Introduction
The Department of Energy
(DOE) is among the leading research agencies in the world, investing $7
billion annually in research and development (R&D). It is vitally important for research agencies to disseminate
their information as broadly and as quickly as possible, providing access to
data and information that fuel essential knowledge. For over 50 years, DOE’s Office of Scientific and Technical
Information (OSTI) has been collecting, preserving, and disseminating the
Department’s scientific and technical information (STI).
By utilizing Information Age technologies, OSTI has radically changed
its information services and has developed a suite of award-winning Internet
resources that bring science information to the desktop at no cost to the
user. These resources provide
easier, faster, cheaper, more complete, and more convenient means of accessing
and using global STI by scientists, researchers, academia, industry, and the
public.
One-stop
shopping access to this suite of resources is provided through EnergyFiles,
a Web-based virtual library that provides easy access to collections of both
DOE and worldwide energy-related scientific and technical information.
EnergyFiles contains a search mechanism, EnergyPortal, that is
easy to use, integrates parallel searching, and retrieves information from
heterogeneous and geographically dispersed databases and Web sites. Key components of EnergyFiles are DOE R&D
Project Summaries (current research), DOE R&D Accomplishments
(outcomes of past research), the DOE Information Bridge (grey
literature), PubSCIENCE (peer-reviewed journal literature), and the PrePRINT
Network (preprints). These
products have been designed to provide remote access to billions of dollars of
energy‑related research performed by DOE and its collaborators.
DOE Information Bridge
The
DOE Information Bridge, made available in April 1998 in collaboration with the U.S. Government
Printing Office (GPO), contains DOE report literature from 1995 forward.
It incorporates over 55,000 full-text reports comprising over 4.3
million pages. It provides free, convenient, and quick access to
full‑text DOE research and development reports in physics, chemistry,
materials, biology, environmental sciences, energy technologies, engineering,
computer and information science, renewable energy, and other topics.
Users remotely access and download the reports free of charge and in
significant volume.
The
DOE Information Bridge focuses on providing access to scientific and
technical reports produced by DOE, DOE national laboratories and DOE
contractors. New reports
processed by OSTI are added routinely and legacy reports are added as
resources permit. Since its
introduction, the content of DOE Information Bridge has more than
doubled.
DOE
Information Bridge search options
include a basic approach that can be concentrated on specific data fields and
an advanced approach that includes Boolean operators to increase search
precision. Users can search the entire collection (full text and
bibliographic data), or they can search portions of it.
Three formats, GIF, PDF and TIFF, are available for viewing the
full‑text page images. Two formats, PDF (image only) and the original input format,
are available for downloading full‑text documents.
This makes reports far easier to use and eliminates the cumbersome and
time-consuming practices associated with searching traditional media.
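The way Boolean operators sharpen a search can be illustrated with a minimal sketch. It is not the DOE Information Bridge implementation: the records, the `build_index` helper, and the `term` function below are hypothetical and exist only to show how AND, OR, and NOT correspond to set operations over an index of words.

```python
# Hypothetical records; the titles are illustrative, not real holdings.
RECORDS = {
    1: "solar energy storage materials",
    2: "nuclear energy reactor materials",
    3: "solar cell chemistry",
}


def build_index(records):
    """Map each word to the set of record ids whose text contains it."""
    index = {}
    for rec_id, text in records.items():
        for word in text.split():
            index.setdefault(word, set()).add(rec_id)
    return index


INDEX = build_index(RECORDS)


def term(word):
    """Return the ids of records containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())


# Boolean operators correspond to set operations on the matching ids.
both = term("solar") & term("energy")        # AND narrows the result set
either = term("solar") | term("nuclear")     # OR broadens it
without = term("energy") - term("nuclear")   # NOT excludes records

if __name__ == "__main__":
    print("solar AND energy:", sorted(both))
    print("solar OR nuclear:", sorted(either))
    print("energy NOT nuclear:", sorted(without))
```

In this sketch, the AND query returns fewer records than either term alone, which is the sense in which Boolean operators increase precision.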
Awards and recognitions received by the DOE Information Bridge include
• a commendation to DOE and GPO by the Depository Library Council to the Public Printer;
• Vice President Gore’s National Performance Review Hammer Award;
• the DOE Information Management Technical Excellence Award;
• a highlight in the October 1, 1998, inaugural issue of Access America Online Magazine, a product of the Government Information Technology Services Board;
• a favorable review in the University of Wisconsin's "Scout Report" for science and engineering; and
• citations by the Global SchoolNet Foundation and Yahoo (Pick of the Day and Week).
Building
and expanding the DOE Information Bridge reinforces DOE’s and GPO’s
commitment to make available DOE research reports and to move federal programs
and activities into the ever-expanding world of the Information Age.
PubSCIENCE
PubSCIENCE was developed to facilitate searching and accessing peer-reviewed
journal literature in the physical sciences and other disciplines of interest
to DOE. Made available in
collaboration with the Government Printing Office (GPO) in October 1999, it
provides for quick, easy, and free searching of a compendium of peer-reviewed
journal citations and abstracts about the physical sciences and other
energy-related disciplines. Hyperlinks
provide access to publisher servers to obtain full-text articles if the user
or organization has a subscription to the journal.
If the user lacks such a subscription, access to the full text can be
obtained by pay per view, by special arrangement with the publisher, by
library access, or through commercial providers.
Over
twenty-five of the most prestigious publishers in the world today are
represented in PubSCIENCE, which facilitates searching of and access to
over 1.8 million records in over 1000 journal titles of peer-reviewed
scientific and technical information.
PubSCIENCE
is the convergence of recent advances in information technology tools (as
evidenced by the Internet), the re-engineering of traditional DOE products and
services, the growing interest of scientific journal publishers in using
the Internet, the information needs of the DOE research community, and the
desire of the GPO to work with other agencies to make electronic government
information and tools available to the public.
Not only is the Internet changing the way publishers think about
publishing, but it has also changed how government views its role in the
dissemination of scientific and technical information.
PubSCIENCE is an outstanding example of converging interests:
the user’s desire to access current scientific and technical literature, the
Department’s desire to facilitate the flow of peer-reviewed scientific and
technical information, and publishers’ interest in obtaining the widest
possible visibility for their published materials.
PrePRINT Network
The
PrePRINT Network was unveiled in January 2000. It
is a searchable gateway to preprint sites that contain information about
scientific and technical disciplines of concern to DOE.
Such disciplines include physics, materials, chemistry, and portions of
biology, environmental sciences, and nuclear medicine.
Collections and resources included on the PrePRINT Network are
provided by academic institutions, government research laboratories,
scientific societies, private research organizations, and individual
scientists and researchers. The PrePRINT
Network facilitates access to these resources, but does not change the
content or data provided by the originating site or author.
The PrePRINT Network combines these dispersed sites into a
comprehensive set of energy research information.
The
PrePRINT Network expedites the dissemination of scientists’ research
results. It is Web-based and
provides access to energy-related papers, draft journal articles, and other
electronic materials produced by researchers.
It provides links to 1000 preprint sites housing over 330,000
documents. Over twenty
heterogeneous preprint databases are available for distributed cross searching
via a single query. In addition,
the PrePRINT Network provides links to over 170 related scientific
societies and associations.
The
PrePRINT Network offers users three options for locating information.
They can browse or search one specific preprint site or a selected set
of sites. The Browse option
allows users to view an alphabetical listing of all of the sites included in
the system and to visit any of the individual sites listed.
Within this option, users may also choose to perform an indexed search
of the HTML pages of the available sites.
This option returns hits for any pages and for linked pages that
contain the specified search term, including some items that may not be actual
preprints. A second option,
Search Selected Sites, allows
users to send a single query to the search engines of selected preprint sites.
This search capability
then compiles the results and returns them to the user. A third option, Subject Pathways, offers users the ability
to browse the preprint collections by subject area. This section includes both preprint collections and preprints
posted by individual scientists on their own sites.
In
most cases, access to the full-text information on the target sites is open,
accessible, and free of charge. By
eliminating the need to locate individual preprint sites through web
searching, researchers can find more relevant information while saving time.
The PrePRINT Network is a single point of entry for preprints in
the scientific and technical areas.
Additional Digital Collections
In addition to this trilogy of products, which addresses the three main ways by which researchers disseminate their findings, OSTI has developed complementary digital collections. These include:
• DOE R&D Project Summaries, which provides brief descriptions of over 17,000 R&D projects currently ongoing within the DOE
• DOE R&D Accomplishments, which showcases outcomes of past DOE research and development that have had significant economic impact, have improved people’s lives, or have been widely recognized as a remarkable advance in science
• OpenNet, which covers the DOE legacy collection of declassified documents and has been developed and maintained by OSTI for the DOE’s Office of Declassification
• ECAPs, which are electronic current awareness publications providing subject-based collections and are sponsored by DOE Programs
• Federal R&D Project Summaries, which was developed as a proof of principle to demonstrate the value of having a portal to information about Federal research projects
DOE
R&D Project Summaries was unveiled in June 1997 to provide the public with access to key
corporate information on over 20,000 research and development projects
performed since 1995 by the Department’s laboratories and other research
facilities. It includes DOE
research activities in a wide variety of energy-related scientific
disciplines. R&D Project
Summaries enables DOE to educate and inform the general public of its
current research and development activities and provides a mechanism for
public access to information about Departmental research capabilities and
activities. DOE R&D
Project Summaries received DOE’s Information Management Quality
Award for Management/Administrative Excellence in 1997 and was recognized with
a Hammer Award from Vice President Gore’s National Partnership for
Reinventing Government in 1999.
The
DOE R&D Accomplishments Web site showcases the proud heritage of the Department’s research and
development and highlights benefits that are being realized now.
It was unveiled in March 1999 as a central forum for providing the
public with information about outcomes of past DOE‑sponsored or
generated research and development. The
outcomes featured have had significant economic impact, have improved
people’s lives, or have been widely recognized as a remarkable advance in
science. The core of the Web site
is the DOE R&D Accomplishments Database, consisting of searchable
full-text and bibliographic citations of documents reporting accomplishments
from DOE and DOE contractor facilities. Complementing
the Database is a page of "Snapshots."
It contains links to items or articles that contain information about
or identify at least one research and development accomplishment.
Snapshots are quick pictures, introductions, overviews, or synopses.
When more information about a Snapshots topic is available via the DOE
R&D Accomplishments Database, links to full-text reports are
identified and provided.
OpenNet
provides easy, timely access to recently declassified DOE information,
including information declassified in response to Freedom of Information Act
requests, and makes it more readily available to the public.
It includes references to all documents declassified and made publicly
available after October 1, 1994, and supports the processes envisioned by the
Openness Initiative: Public Awareness, Public Education, Public Input, and
Public Access.
ECAPs
(Electronic Current Awareness Publications) are a collection of
bibliographic citations, broken out by subject area, from the Energy Science
and Technology Database (EDB). For
DOE reports, links are provided to full text.
These long-standing paper publications were recently transitioned to a
searchable Web product. OSTI publishes several separate ECAPs and maintains a
collection of over 30,000 ECAP citations.
Federal
R&D Project Summaries was released in April 2000 and provides a unique window to the Federal
research community, allowing Agencies to better understand the research and
development efforts of their counterparts in government.
It gives the public insight into how its investment in research and
development is being used and supports full-text single-query searching across
databases residing at different governmental Agencies.
EnergyFiles
The
umbrella for this suite of resources is EnergyFiles,
which was released in May 1997. It
is a Web-based virtual library that provides easy access to over 500 widely
diverse collections of both DOE and worldwide energy-related STI.
EnergyFiles is a dynamic information system that offers users,
participants and contributors the opportunity to leverage collections and
capabilities and to maximize use of energy-related scientific and technical
information.
The
EnergyFiles search mechanism, EnergyPortal
Search, provides for increased site efficiency and ease of knowledge
discovery. EnergyPortal
has conquered a major obstacle confronting multi-source virtual libraries.
It is a unique search capability that provides distributed searching
across the decentralized, heterogeneous databases and web sites linked to EnergyFiles. The user no longer needs to select individual links to sift
through available information in pursuit of what is relevant.
Words or phrases are entered in a single query box and the query is
distributed in parallel to the user-selected multiple databases and Web sites
residing at diverse locations.
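The single-query, parallel distribution described above can be sketched as follows. This is a minimal illustration, not the EnergyPortal implementation: the `SOURCES` dictionary, its titles, and the `search_source` and `distributed_search` functions are hypothetical stand-ins, and in-memory lists take the place of the remote databases and Web sites a real portal would contact over the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote collections; a real portal would send
# the query to each source's own search interface over the network.
SOURCES = {
    "reports": ["Fusion energy materials report", "Solar cell efficiency study"],
    "preprints": ["Preprint on fusion plasma confinement"],
    "journals": ["Journal article: battery chemistry advances"],
}


def search_source(name, query):
    """Search one source independently and return (source, title) hits."""
    needle = query.lower()
    return [(name, title) for title in SOURCES[name] if needle in title.lower()]


def distributed_search(query, selected):
    """Send one query to every selected source in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=max(1, len(selected))) as pool:
        # map() yields results in the order of `selected`, so the merged
        # list is deterministic even though the searches run concurrently.
        per_source = pool.map(lambda name: search_source(name, query), selected)
        return [hit for hits in per_source for hit in hits]


if __name__ == "__main__":
    for source, title in distributed_search("fusion", ["reports", "preprints"]):
        print(f"{source}: {title}")
```

Each source is searched in its own form and the caller sees only the merged hits, which mirrors the idea that the sources need no common format, software, or metadata.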
EnergyPortal
Search continues to represent a breakthrough in information retrieval.
It enables users to search across 26 databases and 500 Web sites using
a single query. The sites are maintained by various agencies, are
geographically dispersed, and require no standardization in terms of format,
software or metadata. EnergyPortal
will search full text when available; DOE databases and collections; databases
of other agencies such as the Defense Technical Information Center (DTIC), the
National Aeronautics and Space Administration (NASA), the National Library of
Medicine (NLM) and the Environmental Protection Agency (EPA); and other
resources. When the individual
database supports it, the searched word or phrase is highlighted for easy
access.
This
distributed search capability demonstrates an essential next step in
information technology - the integration of parallel searching and retrieving
of information from disparate and geographically dispersed databases and Web
sites. The EnergyPortal
distributed search transcends other government agencies’ full-text
information sources. Since it includes only unlimited, unclassified
energy-related information, users in government, industry, academia and the
public benefit from the addition of this capability. Time is saved through
more efficient and effective information retrieval since the information is
accessible on the Internet in an organized, searchable format.
Awards and recognitions received by EnergyFiles are
• Vice President Gore’s National Performance Review Hammer Award;
• a highlight in the October 1, 1998, inaugural issue of Access America Online Magazine, a product of the Government Information Technology Services Board;
• a favorable review in the University of Wisconsin’s “Scout Report” for science and engineering; and
• a feature in "Federal Computer Week," with emphasis on EnergyPortal.
Future Information Infrastructure for the Physical Sciences
This
history of success lays the foundation for a future Information
Infrastructure for the Physical Sciences, which focuses on energy,
science, and technology. The goal
of this new initiative is to provide a comprehensive resource, available at
the desktop, for worldwide scientific information – a Web-based network
where researchers, engineers, educators, students, industry, and the public
can come for answers.
The
objectives of the Information Infrastructure for the Physical Sciences
are to deliver a permanent, comprehensive resource for accessing and using
scientific information; facilitate research and discovery to secure a healthy
and competitive science and technology future; raise scientific and
technological literacy of all Americans; produce the finest scientists and
engineers for the 21st Century; promote scientific research and
development (R&D) results as a foundation for future advancements; and
establish a digital library that is complementary to existing national
libraries in providing Federally-sponsored information to the public.
These existing libraries include the National Library of Medicine, the
National Agricultural Library, the National Library of Education, the National
Transportation Library, the EPA National Library Network, and the National
Science, Mathematics, Engineering, and Technology Education Digital Library (NSDL).
The
Information Infrastructure for the Physical Sciences will significantly
expand DOE's local presence across the nation.
Such an active publicly oriented presence will bring scientific and
technical information, energy data and prices, and consumer and educational
information to the regional level for application and use at the local level.
The major challenge to the
usefulness of a digital library is how to search across heterogeneous
databases and Web sites when there is no standardization of data and
information resides in multiple forms on a variety of unrelated systems at
widely dispersed facilities. Sophisticated
distributed searching capabilities will allow the user to access information
without having to know which database to access, which information collection
to pursue or the organizational structure of the agency making the information
available. This search capability
must be augmented by the ability to deliver the retrieved information
electronically to the desktop, either directly, through licensing agreements,
or through other cooperative arrangements.
User benefits of a digital
library focused on energy, science, and technology are a well-organized,
comprehensive resource not limited by traditional boundaries; access to both
historic and current information; practical information for the consumer;
easy, fast, accurate navigation through collections and resources; science
education resources for educators and students; remote access to scientific
hardware and software; information alert mechanisms to serve industry and
commerce; and permanent public access to Departmental resources.
With
the Information Infrastructure for the Physical Sciences, a foundation
will be provided for the innovative use of three key resources -- worldwide
information, advanced technology, and people -- to deliver validated research
information while strengthening and sustaining the Nation’s leadership in
science and technology. Resource requirements, partnership arrangements, and numerous
other planning activities are currently being explored as support for this
initiative continues to grow.
Karen J. Spence
Assistant Director
Office of Program Integration
U.S. Department of Energy
Office of Scientific and Technical Information
P.O. Box 62
Oak Ridge, TN 37831
Phone: (865) 574-0295
Fax: (865) 241-3826
spencek@osti.gov

Mary V. Schorn
Technical Information Specialist
Office of Program Integration
U.S. Department of Energy
Office of Scientific and Technical Information
P.O. Box 62
Oak Ridge, TN 37831
Phone: (865) 576-2413
Fax: (865) 241-3826
schornm@osti.gov