Grey Literature in Energy: A Shifting Paradigm

By Deborah E. Cutler, U.S. DOE/OSTI
October 4-5, 1999

The Fourth International Conference on Grey Literature

There were challenges to undertake and exciting new partnerships to forge as the exchange medium for energy information moved from paper and microfiche to electronic formats. As the paradigm for storing, retrieving, and disseminating energy-related grey literature has shifted and evolved, the Office of Scientific and Technical Information (OSTI), the Department of Energy's (DOE) information arm, has led the way. At this particular stage in the shift, OSTI has overcome a number of challenges and recognizes that there are still more to come. Dealing successfully with those challenges has come only as a result of a commitment to vision and cooperation/support from domestic and international partners. Of course the vehicles provided by the Internet, the World Wide Web, and related technologies were key, as well. OSTI has been able to offer the user community direct desktop access to a significant volume of energy-related grey literature. That access is FREE.


Producers and users of information have found themselves in a rapidly changing environment as the Internet and its support technologies have exploded. A companion event to that technology explosion has been the slashing of government budgets and dramatic downsizing of the government workforce. Government agencies can no longer afford to wait 10 years for proven solutions from private industry. There is an immediate need for answers to critical questions. How do you do more with less? What is the most efficient way to meet and exceed customer expectations? How can you keep commitments? How do you remove the "grey" from the traditional definition of grey literature? OSTI learned the answer to "how": you take risks, you jump on the bandwagon, you just do it.


OSTI has been at the forefront and has become a leader among US government agencies in offering grey literature via the Web. The move from the traditional world of paper and microfiche to the electronic world of PDFs and TIFFs has come quickly and not without growing pains. This paper documents OSTI's journey toward that new frontier and the challenges faced along the way.


The old and new frontier.

Although the name and chain of command have shifted several times over the years, OSTI's core mission to collect, preserve, and disseminate scientific and technical information (STI) has remained the same. Over the last 52 years, OSTI's collection of STI has grown to over 1.5 million Energy R&D reports (traditionally viewed as grey literature) and approximately 5 million bibliographic citations representing worldwide energy R&D. The current database, today known as the Energy Science and Technology Database (EDB), has over 3.8 million citations. Over the past few years, OSTI has made significant strides into the information age: defining new electronic exchange formats; creating and linking to collections of digitized STI; serving researchers directly; and developing a Web product called EnergyFiles: Virtual Library of Energy Science and Technology. As of September 1999, the DOE Information Bridge, a component of EnergyFiles, has over 56,000 full-text R&D grey literature reports (approximately 3.8 million pages), capturing a significant portion of DOE and global R&D output since 1996. The global information comes in through OSTI's partners in international agreements. To keep to the terms of these agreements, access to the full DOE Information Bridge (https://apollo.osti.gov/dds) is limited to the DOE and its contractor community. There is, however, public access to DOE's grey literature.

The public version of the DOE Information Bridge (http://www.doe.gov/bridge), offered in collaboration with the U.S. Government Printing Office (GPO) Depository Library system, has over 43,000 DOE R&D grey literature reports (more than 2.8 million pages) that are electronically available and free to the public worldwide. OSTI is proud to have been among the first sites to make such a large volume of what used to be considered grey literature available to the Internet-literate, winning several governmental awards for its efforts. Over the past 6 months, the two Information Bridge products combined have averaged more than 2,200 full-text report downloads per week, far exceeding the traditional distribution methods of the past. The grey literature is not simply available; it is being USED.

The paradigm of grey literature being hard to get has indeed shifted for energy-related STI. So, how did we get there?

5 years ago.

OSTI has always had responsibility for DOE's grey literature. For many years DOE and its contractors were required to provide paper copies of their reports to OSTI for public dissemination. Storage and dissemination at OSTI changed over time from paper, to microcards, and then on to microfiche. A bibliographic database citing the reports and other commercially published energy information was also created. Citations included abstracts and added subject indexing to facilitate retrieval of the information. Public dissemination of the DOE grey literature was accomplished through partnerships with the GPO Depository Library system and with the Department of Commerce's National Technical Information Service (NTIS). Microfiche and paper continued to be the only ways to get this grey literature for many years. The constant challenge was to improve the timeliness of dissemination, given the resource-intensive nature of the storage and retrieval methods available to fill orders.

In addition to collecting and disseminating its own DOE grey literature, OSTI has a long history of international exchange of grey literature and related bibliographic information. OSTI's chief multilateral agreements are with the International Energy Agency Energy Technology Data Exchange (IEA/ETDE) and the International Atomic Energy Agency International Nuclear Information System (IAEA/INIS). ETDE, begun in 1987, currently has 18 countries exchanging the full scope of energy-related information (http://www.etde.org), while INIS, begun in 1970, has over 100 countries and 19 international organizations exchanging primarily nuclear-focused information (http://www.iaea.or.at/inis/inis.htm). From the inception of these agreements, OSTI has served as the US delegate and met US commitments as well as possible given recurring budget constraints. Since ETDE's inception, OSTI has managed and operated ETDE on behalf of its members as Operating Agent. As within DOE, five years ago the paradigm for grey literature exchange in these agreements was also paper and microfiche.

Drivers for change.

The 1990s of course brought better and faster computers, much wider use of the Internet, and a vastly more computer-literate workforce. At the same time, reduced budgets and downsizing of government became the norm. OSTI in particular saw its budget decline by nearly half even though information and technology were continually recognized and touted as the wave of the future. While new technologies allow for cost efficiencies, an up-front investment is usually required. How to cope in this new environment with the new resource limitations became a formidable challenge. User expectations began to shift rapidly. Few would dispute that the Internet has considerably influenced the mindset of users, who now want instant gratification. Information users today have little patience to wait weeks for a copy of a document; even waiting minutes for file downloads now seems like an eternity. The concept that because something is free it has less value has decidedly lost its influence. Many users today gravitate to what's easiest to get rather than the best or most scientifically correct. Bibliographic information alone is passé; users demand full text... Now!

To address resource issues as well as the new environment, OSTI knew it had to take advantage of new technologies and work with its partners wisely. It was no secret that more and more documents were being created in electronically shareable formats - as opposed to the sometimes cumbersome strict publishing formats of the past. Sites throughout the DOE community were evolving internally to new ways of producing their reports. Technology offered TIFFs, GIFs and PDFs as new options in the electronic viewing and printing of documents, without requiring users to have specific word processing software. Full text searching was becoming a real possibility, bringing with it some schools of thought that bibliographic information would someday be obsolete. The vision, as OSTI saw it, was to give information to the user at the desktop, and this meant electronic full text.

In 1996, OSTI made a strategic decision to migrate to electronic full text and transition out of microfiche production. The challenge: what format(s) do you standardize on? Since DOE documents were continuing to come to OSTI in paper form, the format chosen for storage and manipulation was TIFF Group IV, and a scanning process was established. Microfiche production continued in parallel for some time, but was modified to use electronic TIFF images instead of the traditional paper and camera methods.

The next challenge was to choose a dissemination outlet. Thus the DOE Information Bridge concept was born. This Web concept retained the bibliographic record as the base for searching but added a hyperlink to the full text. In addition, some full text searching capability was built into the product line. This was achieved by running an automated OCR process on the scanned TIFF images to build a text index. Although no manual clean-up of the data was done, a good-quality copy produced quite good results and served to enhance bibliographic searching.
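
The approach described above, a bibliographic record as the search base, enriched by uncorrected OCR text and carrying a hyperlink to the full text, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the record fields, the sample report identifiers, the example.org URLs, and the function names are not taken from OSTI's actual system.

```python
import re
from collections import defaultdict

# Hypothetical sample records: bibliographic fields, raw OCR text,
# and a hyperlink to the full text (all values are illustrative).
RECORDS = [
    {
        "id": "RPT-0001",
        "title": "Geothermal well logging methods",
        "abstract": "Survey of logging tools for geothermal wells.",
        "ocr_text": "Temperature gradients were measured in each borehole.",
        "fulltext_url": "http://example.org/fulltext/RPT-0001.pdf",
    },
    {
        "id": "RPT-0002",
        "title": "Photovoltaic module degradation",
        "abstract": "Field data on module output over ten years.",
        "ocr_text": "Degradation rates varied with temperature and humidity.",
        "fulltext_url": "http://example.org/fulltext/RPT-0002.pdf",
    },
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each term to the set of record ids whose fields contain it."""
    index = defaultdict(set)
    for rec in records:
        # Bibliographic fields and OCR text feed the same index, so OCR
        # text supplements rather than replaces bibliographic searching.
        for field in ("title", "abstract", "ocr_text"):
            for term in tokenize(rec[field]):
                index[term].add(rec["id"])
    return index


def search(index, records, query):
    """Return (id, full-text URL) pairs for records matching all query terms."""
    terms = tokenize(query)
    if not terms:
        return []
    matching = set.intersection(*(index.get(t, set()) for t in terms))
    by_id = {rec["id"]: rec for rec in records}
    return [(rid, by_id[rid]["fulltext_url"]) for rid in sorted(matching)]
```

In this sketch a query for a word that appears only in the scanned text, such as "borehole", still finds the report, which is the sense in which OCR output can enhance a search built on bibliographic records.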

Another challenge faced during this time frame was whether access should be free. Should users see the information for free? Should they be able to download whole documents or only single pages? As a government entity, OSTI's work was already funded by taxpayer dollars. But this was added value, an extra service. Then again, billing meant accounting systems and additional resources. The end result was free access, and the capability for both page-at-a-time viewing and full document downloads.

The next goal was to get partners involved. In collaboration with the DOE STI partners, regulations were revised to state that the preferred method of receipt of DOE reports would be electronic, and that soon dissemination too would be electronic. In early phases, only 5 formats were to be permitted on receipt: SGML, HTML, TIFF Group IV, PDF, and PostScript. The receipt format became known as the native format. At the time, SGML looked the most promising for scientific exchange of information, although experience has not borne that out. While OSTI does receive a few documents in SGML, the large majority have been in PDF, followed by TIFF, then HTML, along with still quite a lot of paper. In recent times, OSTI has added acceptance of some word processing formats, but receipts to date have been few in number. TIFF images continue to be created from any paper received.

Regardless of the native format, OSTI initially chose TIFF Group IV as the standard for storage, dissemination and archival purposes. Over time, however, PDF has taken a dominant role, and PDF-wrapped TIFF images are also offered for dissemination for all documents residing on the Information Bridge. It is not clear yet whether OSTI will move to storing only PDF for archive, but it is a real possibility given the storage required to handle native, PDF and TIFF copies of each document. The challenge of bulk delivery of electronic full text was met by choosing the 8mm DAT as the output medium for the TIFF images (in addition to their availability via the Information Bridge product line). Short SGML files for each document accompany the TIFF images on DAT, to provide brief bibliographic information.

The next challenge was to involve international partners. OSTI's move to electronic full text certainly had impacts on these partnerships. OSTI worked closely with the INIS Secretariat in Vienna, Austria to share future directions and ideas, since US information made up a considerable portion of the INIS system. INIS also chose to migrate to electronic full text, with TIFF as the primary storage format following their own scanning procedures. However, INIS chose a CD-ROM companion product to the INIS bibliographic database as their dissemination medium for members. With a number of countries in INIS still having limited Internet access, CD-ROM was judged the best way to make full text available to all member countries. The main drawback of this system has been the number of CD-ROMs necessary to hold the data. INIS is currently looking at DVD options to condense the number required for a full collection, but much depends on the availability of the technology in those same countries where Internet is not a universal solution.

INIS has also chosen to transition out of microfiche dissemination, but decided to retain microfiche production for archival purposes, although it too is produced from TIFF images. In addition, OSTI receives an 8mm DAT of TIFF images from INIS to load as part of the limited access Information Bridge product, and provides INIS with the US full text information on DAT.

With OSTI as the ETDE Operating Agent, plans were shared within the ETDE community. ETDE member countries were encouraged to submit data electronically, and the Information Bridge and 8mm DAT dissemination were offered to ETDE input centers. Several countries receive the electronic full text to serve users in their country more directly. As Operating Agent, OSTI scans in grey literature in paper form received from ETDE members to create electronic full text. This is then added to the Information Bridge limited access product. While some documents are received at INIS and ETDE in electronic format, the large majority continue to come in paper form. The last year has shown movement in some countries, however, to improve this situation.

Since 1996, many issues regarding electronic full text have arisen and been met. Some that continue to be considered include which formats should be allowed, and which should be considered the archive. How can searching be improved? An issue related to ownership arose within the international community that DOE was fortunate enough not to face. Some full text sources who were perfectly content with allowing microfiche copies of their documents to be distributed were not so eager to allow the proliferation of electronic versions. For some countries, this caused the unexpected side effect of a decline in the number of documents that could be made available by ETDE and INIS, making that literature even more grey in terms of ready access.

As electronic availability became more the norm within the DOE community, OSTI has offered DOE sites the option to send a URL to their documents, if the sites already have them on the Internet. Within the Information Bridge structure, users can still hyperlink to the documents, but they will not physically reside at OSTI. Thus, storage of electronic full text is moving to a decentralized model even though retrieval continues to be facilitated by a centralized database. Archive issues remain a concern with this model, but sites are supposed to notify OSTI before eliminating their Internet access. Coming soon will be OSTI's implementation of persistent URLs (PURLs) to allow more direct access to documents and to minimize the impact on users if documents are moved to a new location.
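
The general idea behind a persistent URL can be sketched in a few lines of Python: users and database records hold a stable identifier, and a central registry maps it to wherever the document currently lives. This is a minimal illustration only, not a description of OSTI's planned implementation; the class name, method names, identifier format, and example.org addresses are all hypothetical assumptions.

```python
class PurlRegistry:
    """Hypothetical registry mapping stable identifiers to current locations."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_path, target_url):
        """Associate a persistent path with the document's current URL."""
        self._targets[purl_path] = target_url

    def move(self, purl_path, new_target_url):
        """Record that a document has moved; the persistent path is unchanged."""
        if purl_path not in self._targets:
            raise KeyError(f"unknown persistent path: {purl_path}")
        self._targets[purl_path] = new_target_url

    def resolve(self, purl_path):
        """Return the current URL for a persistent path, or None if unknown."""
        return self._targets.get(purl_path)


# Usage: the citation keeps pointing at "/purl/12345" even after the
# hosting site relocates the file (all values are illustrative).
registry = PurlRegistry()
registry.register("/purl/12345", "http://site-a.example.org/reports/12345.pdf")
registry.move("/purl/12345", "http://site-b.example.org/archive/12345.pdf")
current_location = registry.resolve("/purl/12345")
```

Because only the registry entry changes when a site reorganizes its files, hyperlinks stored in a centralized database would not need to be rewritten, which is the benefit to users described above.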

Still to come?

For OSTI, new versions of the Information Bridge product line will soon be available, adding some enhancements for users. A popular request from users that will not be implemented right away due to cost is to make the full text searchable once downloaded. Neither TIFF nor the PDF-wrapped TIFF allows such functionality. While OSTI has looked into various options for doing so, all take time and resources.

EnergyFiles (http://www.osti.gov/EnergyFiles), OSTI's virtual library product that directs users to all types of energy information, continues to grow, with distributed searching options now being offered. In addition, a new product called PubSCIENCE (http://www.osti.gov/pubsci) debuts in October and will provide users with the capability to search across a large compendium of peer-reviewed journal literature with a focus on Physical Sciences and Technology. Online access to the full text of many journal articles has been a much desired next step expressed by researchers. Although viewing of the full text from most publishers is not without cost, PubSCIENCE centralizes access to many journal articles without the user having to leave the office. In some ways, while journal literature has never been considered grey literature in the traditional definition, almost any paper-only product line could be considered grey literature in today's electronic environment.

In the international arena, ETDE has opted to have an Information Bridge-like product available to users in their countries beginning in late October. The product, to be called ETDEWEB or ETDE World Energy Base, will be offered from the ETDE Web Site at http://www.etde.org.

OSTI feels proud to have done its part to meet the challenge of removing the "grey" from grey literature in energy by making it accessible. It has done so despite limited resources and with some difficult decision-making along the way. Without the commitment of a dedicated and capable staff and willing partners, grey would still be grey, and OSTI would be going down the path of the dinosaurs.