"Creating a National Library Focusing on Energy, Science, and
Technology for the U.S. Department of Energy"

Dr. Walter L. Warnick, Director

Office of Scientific and Technical Information
U.S. Department of Energy

Submitted to the Energy Resource 2000 Conference

March 30, 2000


The Need for Basic Research

A young father of two lies in a hospital bed seriously ill. The physician says there is no treatment. The pancreas is secreting substances that are digesting itself and destroying surrounding tissue. Some patients recover on their own; others simply expire. Only time will tell which fate awaits the young father.

Can we doubt that natural laws allow some remedy which shall assist the body's own defenses and cause the pancreas to heal? But if there is such a remedy, why, then, has the physician not used it? The answer is ignorance - not of just one physician, but of the medical profession as a whole.

The physician is waiting for others to discover the remedy. He is waiting because no predecessor mastered the natural laws which govern the pancreas. Knowledge of natural laws requires research.

Let us in the present generation take warning. We cannot plead ignorance about the life or death consequences of investing in research. We see every day the results of research that have so improved people's lives. When you hear about the next life-saving device or therapy, ask yourself how many young fathers and mothers and children died because the discovery did not come sooner. Let us build on the clear lessons of the past and encourage research for the sake of our futures.

The young father in my story is a real person. His name is Vince Dattoria. As it happened, fate was kind to him. He recovered and he is now back working for me at DOE.

Think about it. Vince almost died because of ignorance. Almost every advance in the medical sciences has been made possible by a previous advance in the physical sciences. The human body is a chemical and physical problem, and these sciences must advance before we can conquer disease.

A practice that continues to this day is known as exploratory surgery. It is done in ignorance, when the physician lacks any other way to learn the nature of a patient's problem. It amounts to nothing more than cutting the patient open to see what's inside.

Exploratory surgery is still practiced, but much less commonly than before. This is a very happy development. The decline in exploratory surgery is attributable to the decline in ignorance, pure and simple, thanks in large measure to imaging technology like CT (computed tomography) and MRI (magnetic resonance imaging) scans. CT and MRI scans came from research. For developing the mathematical algorithm needed to create the images, DOE physicist Allan Cormack won the Nobel Prize in medicine, not in physics, in 1979.

Because neither CT nor MRI can allow the physician to view all internal organs and conditions, exploratory surgery still remains with us. What separates barbarism from medicine is little more than physics and chemistry. Physics and chemistry are what we do in DOE.

The 20th century was the Century of Physics, producing nuclear power, space travel, computers, and numerous other advances. That century has ended, and life sciences now are offering immense opportunities. Too little appreciated is the fact that much of the progress in the life sciences is dependent upon prior advances in the physical sciences.

Of course, the benefits of research in the physical sciences are not limited to medical applications. They are evident in so much of modern technology, from cleaner power plants to the Internet.


Federally Funded Research: Improving People's Lives

Today, almost all basic research is funded by the Federal government. This was not always the case. Years ago, a considerable amount of basic research was sponsored by the private sector.

Scientific research and the knowledge and technologies that follow have been credited with about half of the productivity growth of the United States' economy in the past fifty years.

Continued leadership in science and technology is a cornerstone of our Nation's economic prosperity and growth. Information technology alone accounts for one third of U.S. economic growth and is creating jobs that pay almost 80 percent more than the average private-sector wage.

DOE is among the leading research agencies in the world. Many of the technologies that are fueling today's economy, such as the Internet, build upon government investments in the 1960s and 1970s. The Department of Energy and its predecessor agencies have been proud sponsors of science-driven growth through the combined efforts of the national laboratories, 70 Nobel Laureates, and thousands of outstanding university- and industry-based researchers nationwide. As Secretary of Energy Bill Richardson recently stated (Science in the 21st Century: A U.S. Perspective, February 1, 2000):

"I don't think anyone can argue with the assertion that science was the author of the 20th century. ... In its ubiquity, science has fundamentally altered how we think of the universe, of the forces that bind it together, and, ultimately, of ourselves."

Much of the Department's R&D is performed by a system of large National Laboratories, e.g., Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, Argonne National Laboratory, and others -- 39 facilities in all with about 100,000 scientists and engineers.

Fueling the Science Mission

DOE invests $7 billion annually in R&D. Relating back to my opening story, the Department of Energy will invest a portion of its budget in medical research in FY 2000, including:

· $50 million in topics relating to pharmaceuticals, isotope and epidemiological research;
· $90 million relating to genome research;
· $30 million relating to structural biology research.

The principal deliverable from R&D is scientific and technical information. It is in the vital interest of all research agencies that their information be disseminated as broadly and as quickly as possible.

That is the mission of my office: the Office of Scientific and Technical Information (OSTI) collects, preserves, and disseminates scientific and technical information created by the Department. We also provide access to national and global information for use by DOE and the research community. We have been in business for over half a century.

This mission has not changed significantly over the years. Yet Information Age technologies have radically changed the manner in which the mission is carried out.

Total costs of the STI enterprise across DOE are estimated at $200 million annually, a small investment compared to the overall R&D expenditure. OSTI's budget is a mere fraction of this (about 4%).
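To put these proportions in concrete terms, the figures quoted above can be worked through in a short Python sketch. It uses only the numbers stated in this paper ($7 billion in R&D, an estimated $200 million for STI, and an OSTI share of about 4%). Because the inputs are estimates, the outputs are approximations as well, and the variable names are chosen here purely for illustration.

```python
# Figures quoted in the text; the STI cost and OSTI share are estimates.
rd_budget = 7_000_000_000      # annual DOE R&D investment, in dollars
sti_cost = 200_000_000         # estimated annual cost of STI across DOE
osti_share_percent = 4         # OSTI's approximate share of the STI cost

# Integer arithmetic keeps the dollar amount exact.
osti_budget = sti_cost * osti_share_percent // 100

# Express both amounts as a percentage of the R&D investment.
sti_percent_of_rd = sti_cost / rd_budget * 100
osti_percent_of_rd = osti_budget / rd_budget * 100

print(f"OSTI budget: about ${osti_budget:,}")
print(f"STI as a share of R&D: about {sti_percent_of_rd:.1f}%")
print(f"OSTI as a share of R&D: about {osti_percent_of_rd:.2f}%")
```

On these assumptions, the entire STI enterprise costs roughly three cents for every dollar of R&D, and OSTI's share is on the order of a tenth of a cent.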

Information Is Integral to Science and Technology

Information is integral to the science and technology enterprise.

As Secretary Richardson stated, "For science to rapidly advance at the frontiers, it must be open. And shared knowledge is the enabler of scientific progress."

Here's an analogy to demonstrate just how integral information can be to an enterprise. The phone company gives away the "white pages" without a separate incremental charge. The white pages are nothing but information, just like scientific and technical information. Why does the phone company give away the white pages?

Is it because:

(A) The phone company has overlooked an obvious opportunity to recover the cost of the white pages?
(B) The phone company is made up of nice, generous-hearted people who are happy to charitably make this information available at the company's expense? Or,
(C) The phone company has determined that bundling the cost of the "white pages" with their mainline services is a sound business practice?

Obviously, the answer is C. The phone company has determined, correctly, that disseminating phone information is an integral part of the company's business, and that charging for information would discourage its dissemination and thereby hurt the company's core business.

Similarly, the Federal Government should determine that dissemination of the information coming from its huge R&D investments is what makes R&D useful. Further, government needs to commit sufficient funding to broadly disseminate its information, just as the phone company commits funds to give white pages to all likely users.

What good is basic research unless the resulting information is accessible and used? It is the access to data and information that fuels essential knowledge to advance science and technology.

The government spends billions of dollars to fund research, but it sometimes balks at spending a few dollars to ensure access to and preservation of the literature that is the principal deliverable coming from that same research. This is short-sighted.

Bringing Science Information to the Desktop

For 53 years, the Office of Scientific and Technical Information has been managing DOE's technical information program. From the beginning, the fundamental purpose was to ensure that research results were reported and made available to the agency and to the broader scientific community.

In the last three years, Information Age technologies have radically changed our information services. Our patrons increasingly want information right at the desktop. Accordingly, we have transitioned our operations from a paper-based environment to a decentralized electronic environment.

Employing Information Age technologies, OSTI is bringing science information to the desktop. In order to accomplish that, we have "harnessed" DOE's information resources in an electronically decentralized environment.

Across the DOE Laboratories, we lead the Scientific and Technical Information Program, a collaboration of information professionals. Practices are in place to link seamlessly to collections of information at multiple laboratories. In one sense, we serve as a pointer to those decentralized collections at our National Labs and elsewhere.

When a Lab elects not to host its own STI, then we do it for them.

This change has been driven by a sense of urgency brought on by the pace of technology: "There are no speed limits on the Web. Only penalties for moving too slow."

We are working hard to share information faster, more completely, more conveniently, and at lower cost.

DOE Information Bridge

Until a few years ago, the method of disseminating DOE research results was through bibliographic databases such as Nuclear Science Abstracts and the Energy Science and Technology Database. Both databases contain "information about information," referring the patron to the paper or microfiche sources of the full text document.

The most significant advance occurred when access was expanded from bibliographic data to full text for grey literature (that is, technical reports and conference papers). DOE is a major producer of scientific and technical grey literature, especially in the parts of the agency that focus on applied R&D.

OSTI's introduction in 1998 of the DOE Information Bridge provided access to the full text of DOE-sponsored grey literature. Each word of each report is searchable in this collection. As of March 1, the Information Bridge had grown to over 62,000 reports and to over 4 million searchable pages covering the period 1995 to present. Through a partnership with the Government Printing Office, the Information Bridge is available free to the public at http://www.osti.gov/bridge.

Researchers regularly access this collection and download full text reports at a rate of about 2500 per week.

This collection has won numerous awards, including Vice President Gore's Hammer Award from the National Partnership for Re-inventing Government, commendation from the GPO Depository Library Council, and numerous others.

With the use of this Information Age collection well established for grey literature, DOE turned its sights toward the other ways by which scientists disseminate their findings. The most prevalent form is journal literature.

PubSCIENCE

Following the path forged by the National Library of Medicine with its life sciences product PubMed, we developed PubSCIENCE. In assessing the need for such a collection in the physical sciences, we worked closely with the American Physical Society. PubSCIENCE filled a void.

PubSCIENCE is the culmination of a lifetime of scientific and technical information dissemination. It was developed to facilitate searching and accessing peer-reviewed journal literature in the physical sciences and other disciplines of interest to DOE.

An exciting feature of PubSCIENCE is that its citations come in a new way: collaborating publishers contribute their citations based on agreements negotiated with OSTI.

PubSCIENCE allows the patron to search across abstracts and citations of multiple publishers at no cost to the patron. The patron need not know ahead of time which journal has the information he or she seeks. Once the patron has found an interesting abstract, a hyperlink provides access to the publisher's server to obtain the full-text article. The article will come up immediately if the patron or the patron's organization has a subscription to the journal. If the patron lacks such a subscription, access to the full text can be obtained through pay-per-view, special arrangement with the publisher, library access, or commercial providers.

OSTI's primary patrons are scientists at the DOE system of National Laboratories. PubSCIENCE is particularly attractive to such large institutions, as they are increasingly using site licenses to bring full-text journals to their scientific staffs. For example, Los Alamos National Laboratory has site licenses to well over 2,000 journals. At any institution that has a site license hosted at a publisher's server, the hyperlinks to full-text in PubSCIENCE are automatically live.
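The access paths just described can be summarized in a small Python sketch. This is an illustration of the decision only, not a description of how PubSCIENCE or any publisher's server is actually implemented. The function name `full_text_options`, the journal titles, and the wording of the returned options are all hypothetical.

```python
def full_text_options(journal, personal_subscriptions, site_licenses):
    """Return the ways a patron could reach the full text of an article.

    `personal_subscriptions` and `site_licenses` are sets of journal titles
    held by the patron and by the patron's institution, respectively.
    """
    if journal in personal_subscriptions or journal in site_licenses:
        # A subscription or site license makes the hyperlink live.
        return ["immediate full text at the publisher's server"]
    # Otherwise the patron falls back on the alternatives named in the text.
    return [
        "pay-per-view",
        "special arrangement with the publisher",
        "library access",
        "commercial provider",
    ]


# Hypothetical laboratory with one site-licensed journal.
lab_site_licenses = {"Hypothetical Journal of Physics"}

print(full_text_options("Hypothetical Journal of Physics", set(), lab_site_licenses))
print(full_text_options("Hypothetical Journal of Chemistry", set(), lab_site_licenses))
```

In practice the subscription check is made by the publisher's server when the patron follows the hyperlink; the sketch folds it into one function only to make the branching visible.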

Currently, PubSCIENCE covers 1,032 journals from 26 participating publishers and holds 1.7 million journal citations. In the future, we plan to continue expanding the number of journal titles in areas of interest to DOE.


PubSCIENCE Ribbon-Cutting

Developed in partnership with the Government Printing Office (GPO) and 21 publishers, PubSCIENCE was unveiled at a ribbon-cutting event by Energy Secretary Bill Richardson and the Superintendent of Documents Fran Buckley last October. Through this collaboration with GPO, PubSCIENCE is also available for public use through "GPO Access" at http://www.access.gpo.gov/su_docs/executive.html.

Global information sharing has become a reality via the Web, making public availability actually easier and cheaper to implement than restricting access. The response from patrons has been quite favorable.

Some futurists have caused a stir by predicting the demise of traditional scholarly publishing. Lamenting the costs incurred by libraries to purchase subscriptions and delays in publishing, they cite preprint or e-print servers as the wave of the future, to replace traditional publishing.

Indeed, the medium of traditional publishing is changing. Most traditional journals are now available electronically. But we should not equate an evolution of media, from paper to electronic, with a fundamental shift away from journals. I see no real pressure coming from scientists to forsake journal publishing. Indeed, a recent survey of scientists showed that, when selecting the journal to which they will submit their papers, the most important factor in their decision, by far, is prestige of the journal. Not cost of the journal, not speed of publication, but prestige.

PrePRINT Network

DOE's most recent Web-based product, however, is not PubSCIENCE but rather the PrePRINT Network, launched on January 31. The PrePRINT Network (http://www.osti.gov/preprint) is a gateway to sources of preprints dealing with scientific disciplines such as physics, materials, chemistry and others of concern to DOE. My office does not operate any of the preprint servers. Rather, the PrePRINT Network is a gateway to the universe of preprint servers, which number over 625 with over 235,000 preprints. We have tried to capture every preprint server in the world within scope.

The patron has several options for searching, including querying multiple preprint servers via a single query or browsing by subject. When the patron places a query, the PrePRINT Network (PPN) accesses several selected databases, causes searches to be done by their search engines, and then compiles the results. Essentially, the network is acting as a PARALLEL PROCESSOR, uniquely created for searching across preprint servers that do not have standardized data formats and are geographically dispersed. A patron need not know ahead of time which preprint server has the information he or she seeks.
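The pattern described here, in which one query is sent to several servers at once and the results are merged, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PrePRINT Network's actual implementation. The two "servers" are hypothetical in-memory stand-ins with deliberately different record formats, and the names `search_server_a`, `search_server_b`, and `federated_search` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two preprint servers with different record formats.
SERVER_A = [
    {"title": "Neutron scattering in thin films", "url": "http://a.example/1"},
    {"title": "Plasma confinement notes", "url": "http://a.example/2"},
]
SERVER_B = [
    ("Thin films for solar cells", "http://b.example/9"),
]


def search_server_a(query):
    """Search server A (a list of dicts); return (source, title, url) tuples."""
    q = query.lower()
    return [("A", r["title"], r["url"]) for r in SERVER_A if q in r["title"].lower()]


def search_server_b(query):
    """Search server B (a list of tuples); return (source, title, url) tuples."""
    q = query.lower()
    return [("B", title, url) for title, url in SERVER_B if q in title.lower()]


def federated_search(query, searchers):
    """Send one query to every searcher concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(searchers)) as pool:
        # map() returns results in the order of `searchers`, so the merge
        # is deterministic even though the searches run in parallel.
        result_lists = list(pool.map(lambda search: search(query), searchers))
    merged = []
    for results in result_lists:
        merged.extend(results)
    return merged


hits = federated_search("thin films", [search_server_a, search_server_b])
for source, title, url in hits:
    print(source, title, url)
```

Each searcher hides its server's native format behind a common return shape, which is what lets a single query span collections that share no data standard. A real gateway would also need timeouts and error handling for servers that fail to respond.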

EnergyFiles: Virtual Library Collections of Energy Science and Technology

The PPN is the latest in a series of Web-based services developed by OSTI and made available to the public through EnergyFiles: Virtual Library Collections of Energy Science and Technology.

The parallel processor searching capability is used here in the EnergyPortal search, a feature within the Virtual Library Collections of Energy Science and Technology. Through this "portal," over 500 collections of scientific and technical information can be searched. EnergyPortal enables distributed searching across geographically dispersed databases and various government and private/non-profit Web sites.

EnergyFiles also offers 14 energy-related subject pathways for patrons to choose from; alternatively, a multidisciplinary search can be invoked.

The implications for building inexpensive distributed digital libraries are truly profound.

Building Knowledge Assets

With the addition of preprints to our suite of Web products and services, all three of the ways by which researchers make their results known are now accessible on the Web:

· Grey literature, through the DOE Information Bridge;
· Journal literature, through PubSCIENCE; and
· Preprints, through the PrePRINT Network.

Whereas a few years ago, scientists communicated their findings primarily by two methods, grey literature and journal literature, they now have preprints as an increasingly popular third way to communicate. My personal view is that this mix of three ways by which scientists communicate their findings will persist far into the future.

Each way has its own set of strengths and weaknesses. That is why we at DOE have determined not to mix products. Journal literature, grey literature, and preprints are separated. Patrons have a distributed search system to pulse all these collections with one query, if they choose, but we do not want users to lose sight of the type of literature they are viewing.

Each of these is a vast virtual collection. In each case the information is accessible via an OSTI Web site, but the full-text information resides at servers all over the U.S. or the world.

Other digital collections developed by OSTI to meet the needs of DOE's R&D community include:

· R&D Accomplishments showcases outcomes of past DOE R&D that have had significant economic impact, improved people's lives, or been widely recognized as a remarkable advance in science.
· R&D Project Summaries provide brief descriptions of over 17,000 R&D projects currently ongoing within the DOE.
· OpenNet covers the DOE legacy collection of declassified documents, developed and maintained by OSTI for the Office of Declassification.
· ECAPS are current awareness publications that are subject-based collections sponsored by DOE Programs. Recently these have transitioned to searchable Web publications with links to full text.

Our aims are simple. We aim to be FIRST in grey literature, FIRST in journal literature, FIRST in preprints, and FIRST in the hearts of our researchers. This vision is what has gotten my office so excited. It is tantamount to conquering text in the physical sciences, a goal that has never before been within human reach.

Definition of a Digital Library

This is my definition of a Digital Library. It is important to note that a mere collection of pointers to databases and information resources does not make a digital national library.

Just as library card catalogs made information retrieval possible in the paper world, sophisticated distributed searching capabilities are a requirement for the digital age. This search capability allows the patron to access information without having to know which database to access, which information collection to peruse, or the organizational structure of the agency making the information available.

And this search capability must be augmented by the ability to deliver the retrieved information electronically to the desktop, either directly, through licensing agreements, or through other cooperative arrangements.

By working with this definition in building new information collections, the core foundation for our future vision continues to grow.

Digital National Library Model

The major challenge to the usefulness of a digital virtual library is how to search across heterogeneous databases and Web sites when there is no standardization of data and information resides in multiple forms on a variety of unrelated systems at widely dispersed facilities.

This model is one concept of an operational approach, based on DOE's transition to electronic information. We have partnerships with both the "suppliers" of STI - whether the information is hosted by them or by OSTI - and the "customers" or "patrons" of the information.

Given the numerous national library initiatives now emerging, even more content is coming to the web.

National Libraries

Three Cabinet-level Agencies have National Libraries. They include:

· The National Institutes of Health (National Library of Medicine); 
· The Department of Education (National Library of Education); and
· The Department of Agriculture (National Agricultural Library).

Each of these is making great strides with digital collections.

By any measure, the National Library of Medicine (NLM) is the leader. They produced PubMed several years ago, and it has become the single most-used collection of information in medicine. Recently, the National Library of Medicine launched PubMed Central, which - unlike PubMed - hosts full text of journal articles and preprints on NLM servers. DOE imitated PubMed when it created PubSCIENCE but has no plans to emulate PubMed Central.

The National Agricultural Library has recently made its AGRICOLA database freely available on the Web. It differs from PubMed and PubSCIENCE in two ways: (1) the obvious difference in subject matter, and (2) AGRICOLA does not offer hyperlinks to full text.

The National Library of Education has recently made its ERIC database freely available on the Web. ERIC has features very similar to AGRICOLA.

Two additional agencies, the Environmental Protection Agency (EPA National Library Network Program) and the Department of Transportation (DOT National Transportation Library), have Web sites offering access to extensive collections of online information and call themselves National Libraries.

Additionally, the National Science Foundation (NSF) has a program solicitation for the National Science, Mathematics, Engineering, and Technology Education Digital Library (NSDL). Focusing on learning resources, it has $13 million of new money, and a solicitation for grant proposals is on the street.

The mere existence of a National Digital Library fosters not only the dissemination of information but its preservation as well. It would be the surest way to promote permanent public access to government information. Additionally, the term National Library announces to the world that the agency has information resources of which it is proud.

The nation needs a National Library focusing on energy, science, and technology -- a place where researchers, educators, students, and citizens can come for answers.

There is great potential to use information technology to improve people's lives. The Internet and other information technologies are changing the way we communicate, learn, and conduct business. These technologies are shaping our economy and our society in the same way that the steam engine and electricity defined the Industrial Age.

The U.S. Government is also striving to make greater use of the Internet. In December 1999, the President issued a memorandum for the heads of Executive Departments and Agencies on the "Use of Information Technology to Improve Our Society." Expectations are that more information and services will be made available electronically.

The Federal science agencies have mutual interests:

· Inform the public about government programs.
· Provide permanent access to a comprehensive collection of current and retrospective federal government information.
· Assist in locating particular fields in government collections.

DOE's National Library Initiative

Within DOE, the concept of a National Library focused on energy, science, and technology is currently being considered. It is still at the idea stage, but indications are that it has merit.

As Secretary Richardson says, "[DOE] needs to let the American people know what DOE is doing for them." A National Library would advance scientific understanding while also showcasing for the public the many DOE scientific contributions that benefit them.

Our objectives for a Digital National Library of Energy Science and Technology include:

· Providing access to information through worldwide delivery;
· "One-stop-shopping" for the agency's various sources of information through a core node or central point of access;
· Full-text and other media resources in addition to mature bibliographic databases;
· A skilled information management staff; and
· Achieving permanent public access while also instituting information preservation.

Access to scientific information is integral to the success of the U.S. efforts in education and research. A National Library is one way we can seize the opportunities being offered by the information society. It would benefit DOE, the scientific community, and the American and global public.