EIDS Update
Remarks by T.C. Evans
Director, Office of Electronic Information Dissemination
Before the Federal Documents Task Force
Government Documents Round Table
American Library Association
Washington, DC
January 13, 2001
Introduction
As always, I appreciate the opportunity to update the library community on the current and future state of GPO Access. Since we last got together, many things have happened and many more are in the offing.
Of particular note is the death of Marty Mehlberg, who as the manager of the Text Processing Section worked tirelessly to ensure the availability and integrity of GPO Access data. He is sorely missed. I would also like to note the recent retirement of Russ Duncan, another key figure in the birth and development of GPO Access. Russ headed the Graphic Systems Development Division and personally programmed many of the applications on GPO Access, and he will also be greatly missed.
Size
GPO Access continues to grow, with over 1,700 official government databases offered through some 80 applications. At this time, over 200,000 electronic titles are available through the FDLP Electronic Collection, with more than 116,000 titles on GPO servers and almost 84,000 titles linked to from GPO Access.
Usage
GPO Access usage continues to amaze, with recent months bringing us to some significant milestones. The more than 26 million retrievals in October propelled total usage of GPO Access to over 1 billion documents retrieved since the service premiered in 1994. The average number of monthly retrievals is steady at just above 26 million and the average size of these documents is currently about 49 KB. According to the Center for Advanced Computing Research at CalTech, 2 KB equals one typewritten page. Therefore the average document retrieved from GPO Access equates to some 24.5 typewritten pages. This means that an average month's retrievals from GPO Access total almost 1.3 terabytes and are equivalent to 637 million typewritten pages.
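The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 terabyte = 10^9 kilobytes), which the remarks do not specify; the function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the usage figures quoted above.
MONTHLY_RETRIEVALS = 26_000_000  # average retrievals per month
AVG_DOC_KB = 49                  # average document size, in kilobytes
KB_PER_TYPED_PAGE = 2            # rule of thumb attributed to CalTech above


def pages_per_document(avg_kb=AVG_DOC_KB, kb_per_page=KB_PER_TYPED_PAGE):
    """Typewritten-page equivalent of one average document."""
    return avg_kb / kb_per_page


def monthly_pages(retrievals=MONTHLY_RETRIEVALS):
    """Typewritten-page equivalent of one month's retrievals."""
    return retrievals * pages_per_document()


def monthly_terabytes(retrievals=MONTHLY_RETRIEVALS, avg_kb=AVG_DOC_KB):
    """Monthly volume in terabytes (assumption: 1 TB = 10**9 KB)."""
    return retrievals * avg_kb / 10**9


print(pages_per_document())   # 24.5
print(monthly_pages())        # 637000000.0
print(monthly_terabytes())    # 1.274
```

Under these assumptions the sketch reproduces the 24.5 pages per document, 637 million pages, and roughly 1.3 terabytes cited above.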
As is usually the case, information on hot topics brings on a burst of use. The Supreme Court decisions relating to the recent election certainly fit this profile. On the day after the Court released its final decision, the site recorded some 6.4 million page views, compared to 1.5 million page views during an average month. User Support contacts went through the roof at the same time, with the GPO Access User Support Team handling nearly 5,500 e-mails from the Supreme Court site in a single week. This represented almost three times the monthly average number of e-mails.
Referrals to GPO Access from other Web Sites
We have begun monitoring the number of referrals to GPO Access from other Web sites and which sites are most often referring users to us. This is accomplished through the use of referral logs that record the host domain from which a referred user was directed to one of the pages on GPO Access. It is important to note, however, that this does not measure the number of links established on other sites using the GETDOC feature that pull documents directly from our databases. It has been most gratifying to see just how many referrals occur in the short time we have been analyzing these logs, as well as the broad array of sites that direct users to us.
The numbers have remained remarkably consistent for the first two months we have analyzed. For October and November we averaged some 600,000 referrals, with about 58% of those coming from no specific referrer. This group includes referrals from favorite lists stored on Web browsers, search engines that do not forward referral information from their results lists, and others in which no information is provided from which the referrer can be identified. There is another three to four percent whose address cannot be resolved from the information provided.
The largest identifiable group of referrals comes from other Government sites. Representing more than 17% of the total, this is clear evidence of how the information on GPO Access is used to facilitate the missions of agencies across Government. It is also evidence of the broad diversity of the constituencies who are served through use of the products and services we provide.
Next in the referral pecking order are the dot-coms at between 11 and 12%. Heading the list is a string of popular search engines, led in both months by Google. It will be important to remember this category when I discuss our search engine project in a few minutes.
No other category approaches 10%. This includes education (.edu) addresses at a little more than 5% and organizations (.org) at just under 2%. Based on a list of domain addresses for Federal depository libraries maintained by the Library Programs Service, depository sites account for approximately 3% of the total referrals to GPO Access pages. Top among the more than 500 depository addresses are sites at the University of Maryland, the University of Michigan, Louisiana State University, the University of North Texas, and Vanderbilt University. About 400 of these addresses, however, send more than five referrals per month, the number commonly suggested as the best indicator of at least one prominent link at the site.
Part of the impetus for reviewing these referral logs came from a request to determine how many referrals we have been receiving from FirstGov. In the first two months, their totals have consistently represented about one half of one percent of the total referrals received. It is interesting that we see FirstGov referrals from 11 different addresses that are not redirects. This means that they are maintaining these as separate individual sites.
As more data is received and time permits, additional analysis will be performed. I will continue to report these results as they become available.
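As an illustration of the kind of tally described in this section, the following is a minimal sketch of grouping referrer entries by top-level domain. It is not the analysis GPO actually ran. The category names, the "-" placeholder for a missing referrer, and the sample URLs are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlparse

KNOWN_TLDS = {"gov", "com", "edu", "org"}


def classify_referrer(referrer):
    """Return a coarse category for one referrer string from a log."""
    if not referrer or referrer == "-":
        return "none"        # bookmarks, typed URLs, withheld referrers
    host = urlparse(referrer).hostname
    if not host or "." not in host:
        return "unresolved"  # address cannot be resolved to a domain
    tld = host.rsplit(".", 1)[-1].lower()
    return tld if tld in KNOWN_TLDS else "other"


def referral_shares(referrers):
    """Percentage of referrals falling in each category."""
    counts = Counter(classify_referrer(r) for r in referrers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical log entries, for illustration only.
sample = ["-", "-", "http://agency.example.gov/links.html",
          "http://search.example.com/results?q=cfr"]
print(referral_shares(sample))
```

A fuller version would also match hosts against the Library Programs Service list of depository domains, which is not reproduced here.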
System Performance
System performance has improved and efforts to enhance system response time continue. The increased bandwidth easily withstood the onslaughts during the election, although a severe strain was placed on our server farm. At one point during the process, the Supreme Court materials were being served from 19 servers through the server controller array. We are currently exploring a relationship with a prominent content delivery network that should produce dramatic additional improvement in service from GPO Access. Under this arrangement, copies of large and popular files housed on some 8,500 servers located at Internet service providers around the world would be supplied quickly to nearby users, greatly reducing the load on our server farm.
What’s new on GPO Access
There are a number of recent changes to GPO Access that should be mentioned. The most notable are:
- The Economic Report of the President 2001 is now available.
- A new browse feature is available for the United States Code. The browse feature allows users to browse individual U.S. Code titles, down to the section level, for the latest available update.
- The United States Government Printing Office Style Manual, 2000 is available.
- The United States Government Policy and Supporting Positions, 2000 (the Plum Book) is available.
- All volumes of The Public Papers of the Presidents of the United States covering the period of 1994 through 1998 are now available.
- A new browseable table of contents feature is available on the Weekly Compilation of Presidential Documents, beginning with the first issue of 2001.
What’s on the Horizon for GPO Access
As always, work is under way to add more content to GPO Access and to refine access to the materials already provided. Some key examples of current efforts are:
- An agreement with the Department of Labor to put the Davis-Bacon Wage Determination materials on GPO Access has been reached and a written memorandum of understanding has been delivered for signature. The application has been built and reviewed by Labor, and the release date is currently scheduled to coincide with the release of the new basic manual in early February.
- The FY 2002 Economic Outlook, Highlights from 1994 to 2001, FY 2002 Baseline Projections is tentatively scheduled to be available on GPO Access and for sale through the Superintendent of Documents on January 16, 2001.
- EIDS staff is in the process of evaluating Helpdesk software for customer support that will further improve customer service available through the GPO Access User Support Team.
- An eCFR application, which will be updated daily, unlike the current Code of Federal Regulations application, which is updated quarterly, should be fully available by summer.
- As a result of the development of the free eCFR application, the Sales program is developing a new e-mail subscription service. Customers will be able to purchase subscriptions that will allow them to be notified via e-mail of any changes in one or more CFR titles and/or parts, as they are published in the Federal Register.
Search Engine Project
The fifth installment of our ongoing effort to improve the accessibility of GPO Access resources through popular search engines has been completed. Although the full report is available on the Federal Bulletin Board, the following stood out among the results of this effort:
- The numbers indicated that overall performance again declined, with test searches returning a top-30 hit only 25% of the time. Top-10 returns dropped to 21% for all searches.
- Of the seven GPO Access pages studied, four did improve in top-30 performance (U.S. Government Online Bookstore 67%, Ben’s Guide to the U.S. Government 43%, The Catalog of U.S. Government Publications 32%, and the Federal Register 15%), but three did worse (CBDNet –3%, GPO Access Home Page –23%, and the Congressional Record –23%). Sadly, the decreases were in the pages that are most successful, bringing the overall average down.
- Of the 23 search engines studied, 12 increased in top-30 performance, 10 decreased, and two remained the same. GoogleUncleSam was far and away the best performer, as 46% of the searches yielded the appropriate GPO Access page in the search results. The next best was Excite at 37%, then Magellan at 34%, and Google, Lycos, and MSN Search rounded out the top five at 31% each. FirstGov finished in a tie for 10th at 26%. Yahoo, the search engine/directory/portal discussed in an interesting article by Laura Cohen in the January issue of American Libraries, found the appropriate GPO Access page only 20% of the time.
- It was clear that what we have done to date is not working and that it is not easy for potential users to find the resources of GPO Access through these search engines.
- In addition to search engines, directories and other portals were examined as well. Based on our initial exploration, it is clear that we need to learn more about them and how they work before we can adequately measure their performance as it relates to GPO Access. Reportable results should come out of the sixth installment of the project.
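The top-30 and top-10 percentages reported above are simple hit rates. A minimal sketch of how such a rate could be computed follows. The rank data is made up for illustration and is not drawn from the study, and the function name is hypothetical.

```python
def top_n_rate(ranks, n):
    """Percentage of test searches whose target page ranked within the top n.

    ranks holds one entry per test search: the 1-based position of the
    target page in the results, or None if the page was not found.
    """
    if not ranks:
        return 0.0
    hits = sum(1 for rank in ranks if rank is not None and rank <= n)
    return 100.0 * hits / len(ranks)


# Made-up ranks for eight test searches on one search engine.
example_ranks = [3, None, 27, None, 12, None, None, 45]
print(top_n_rate(example_ranks, 30))  # 37.5
print(top_n_rate(example_ranks, 10))  # 12.5
```

Computing the same rate per page and per search engine would give breakdowns like those in the list above.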
As a result of these findings, we have taken a number of steps to improve performance and the results of these actions will be measured as part of the sixth installment of the project. In doing so we reached the end of those things that could be done for free, so we have begun to test the use of methods which have a cost associated with them. The steps taken are:
- We revamped the meta tags embedded in our study pages based on excellent feedback received in an open forum held at the Depository Library Conference in October.
- We have begun to insert Dublin Core metadata elements into major GPO Access pages to help search engines index these resources.
- We have subscribed to a submission service that registers our pages with over 1,000 search engines and directories each month and provides us with reports on the success of their efforts.
- We have purchased and begun using software to continually provide fresh submissions of our own to more than 1,000 search engines and directories.
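For readers unfamiliar with Dublin Core in Web pages, the sketch below generates metadata tags using the common HTML convention of `DC.`-prefixed meta names and a `schema.DC` link. The elements GPO actually inserted are not listed in these remarks, so the element choices and values here are assumptions for illustration.

```python
from html import escape

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_SCHEMA = "http://purl.org/dc/elements/1.1/"


def dublin_core_tags(fields):
    """Build HTML head lines for the given Dublin Core elements.

    fields maps an element name (for example "title") to its value.
    """
    lines = ['<link rel="schema.DC" href="%s">' % DC_SCHEMA]
    for element, value in fields.items():
        content = escape(value, quote=True)  # keep quotes and & from breaking the tag
        lines.append('<meta name="DC.%s" content="%s">' % (element, content))
    return "\n".join(lines)


# Hypothetical values for one page.
print(dublin_core_tags({
    "title": "Federal Register",
    "publisher": "U.S. Government Printing Office",
    "subject": "Federal regulations; notices; proposed rules",
}))
```

The output would be placed in a page's head section alongside the ordinary keyword and description meta tags.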
In addition to monitoring the effects of these efforts to improve our positioning, we are also continuing our research to find out as much as possible about how search engines and directories work. There are a number of challenges to this, including the fact that these organizations are extremely reluctant to discuss their methods with us. Another disturbing challenge is the apparent commercialization of the process. There are increasing indications that the industry is moving more and more toward allowing sites to purchase positioning through techniques that give them favored status in the indexing process as a result of buying keywords or advertising. We have begun exploring the process of keyword buying and plan to test the procedure, despite our philosophical disagreement with the practice, to see if it can improve performance.
The implications of this last trend are disturbing. We do not yet know how pervasive the practice is at this time, but we are compelled to learn about it and test it if we are to achieve the goal of improving the visibility of the products and services of GPO Access. The popularity of these imperfect tools demands it, since in our most recent survey one third of the respondents stated that they had found GPO Access through a search engine.
Thank you for your attention and I urge you to stop by Booth Number 347 and see the additions and changes to GPO Access. As always, I want to thank you for your feedback and I look forward to discussing your ideas for a better GPO Access during the conference.
Passion in a Noble Cause: GPO Responds
Reprinted from the January 2001 issue of Searcher, the Magazine for Database Professionals, with the permission of the publisher, Information Today, Inc., 143 Old Marlton Pike, Medford, NJ 08055; 609/654-6266; <http://www.infotoday.com>.
Why "Not"? A "Quint's Online" Backlash
Editor's Comment: Some of my more faithful readers may peruse the column I write for sister publication, Information Today, called - modestly - "Quint's Online." In the October 2000 issue of that magazine (vol. 17, issue 9), I wrote a piece urging the Federal Government to make the Internet their primary publication medium ("Your Tax Dollars at Work"). [If you want to read a copy of that column, click on http://www.infotoday.com/it/oct00/quint.htm.] The column stemmed from musings generated by my participation in a report to Congress by the National Commission on Libraries and Information Science (NCLIS).
In one rather off-hand sentence in that column, I discussed the existing Federal agencies that might take a leadership role in advancing this policy. And in a very off-handed aside, I rejected the U.S. Government Printing Office (GPO) as a candidate. ("Perhaps this could become a new role for a Web-oriented NTIS or for the U.S. Government Printing Office (not), or even the National Archives and Records Administration (NARA).")
Well, the GPO called me on that snide aside - as you can see by the letter below. And I'm overjoyed to see that this government agency seems prepared to defend its role in Federal Internet policy like a lioness guarding her cubs. The letter below contains some useful truths for searchers reaching for Federal data, but, personally, I love its passion in a noble cause--building sound archives and broadening public access by bringing the Feds to the Net.
Sorry for the delay in responding, but this concerns Barbara Quint’s article, "Your Tax Dollars at Work: The Internet Should Serve as the U.S. Government’s Primary Archive," in the October 2000 edition of Information Today. In that article, Ms. Quint calls for the Federal Government "to move to the Web, big time, to ensure the performance of its mission of service to the people of the U.S." But when she suggests a general oversight role for the Government Printing Office (GPO) in making government information available via the Web, Ms. Quint herself says "not!" Quite frankly, that rather casual remark puzzled us, in view of our extensive involvement in Web-based government information dissemination for much of the past decade.
GPO operates GPO Access [at www.access.gpo.gov/su_docs], one of the few government Web sites actually established by law, and one of the longest running, beginning operation in 1994. It is virtually the only government Web site that provides easy, one-stop, no-fee access to information from all three branches of the government, including the daily Congressional Record and Federal Register and Supreme Court reports, as well as a wide variety of other government information products (incidentally, we just put up the Plum Book, the listing of Federal jobs open for appointment under the incoming administration).
Today, GPO Access links the public to nearly 200,000 individual titles on GPO's servers and other Federal Web sites. The titles available from GPO's servers include those put up on agency Web sites hosted by GPO. Moreover, more than 40 percent of the titles available via GPO Access are linked from other Federal agency Web sites, the result of a lot of knocking on doors by GPO to grow a comprehensive collection of Federal information for public access. The public uses the system heavily. Monthly document retrievals today average more than 26 million. As for the information that isn't available on GPO Access yet--such as that detailed in a recent "filegate.gov" article [Wired magazine article]--much of that has more to do with congressional rules on access than with the Web sites on which it is placed--or not placed.
Ms. Quint's article rightly talks at length about the need for permanent public access to Web-based information. In an age when there are thousands of Federal Web sites, on which important public documents appear and disappear with alarming frequency--leading to growing speculation that the current era will one day be known as an enormous "black hole" in the availability of government information--GPO Access is one of the very few government Web sites to make a concerted public commitment to permanent public access. That means once a document goes on GPO Access, it stays there. Every issue of the Congressional Record, the Federal Register, congressional bills, and other documents since GPO Access went live in 1994 can still be found there. Every bit of the "Starr Report" and accompanying documents, if anyone is still interested, is still there, as well as the Cox Committee report on China, the Microsoft decisions, and other documents. Our commitment to permanent public access is spelled out on our Web site at <http://www.access.gpo.gov/ppa/>, and our position on archiving is stated in our Electronic Collection plan at <http://www.access.gpo.gov/su_docs/fdlp/pubs/ecplan.html>. These policies distinguish GPO Access from the vast number of other Federal Web sites in their commitment to permanent public access to online government information.
The fact that printing is still part of GPO's name has led some people to wonder whether we have been able to shake free of our paper-based past and participate fully in the Web-based information arena. They shouldn't wonder any longer. Today, all major documents processed through GPO have Web and print dissemination channels (and some have CD-ROM distribution as well), and GPO services other government Web sites as well as its own; the Library of Congress’s Thomas site, for example, uses congressional information databases built by GPO. Consider the record of GPO's activity in the Web information arena, and both government and public reactions to it:
- This year, GPO's experience in Web operations led us to assist the Supreme Court in the development and release of its widely heralded new Web site. Throughout the Microsoft case, the U.S. District Court for the District of Columbia utilized GPO Access for the public release of its decision documents online. Early in the year, GPO made Ben’s Guide to U.S. Government for Kids available on GPO Access, at <http://bensguide.gpo.gov>, which has since drawn rave reviews from the library, education, and even legal communities.
- In 1999, GPO Access was selected as one of the top 50 legal research Web sites for the year by Law Office Computing magazine, and was named best research site for laws and best government site overall by the newsletter legal.online. It was chosen as the first recipient of the American Association of Law Libraries Public Access to Government Information Award. The Energy Department's PubSCIENCE project, mentioned approvingly by Ms. Quint, is a joint project with GPO begun in 1999 and is available on GPO Access. Also in 1999, GPO and the Department of Energy jointly won a Hammer Award from Vice President Gore’s National Performance Review for the Information Bridge, a partnership that makes thousands of DOE scientific and technical reports available over GPO Access.
- In 1998, GPO Access was named one of the 15 "Best Feds on the Web" by Vice President Gore and Government Executive magazine. Federal Computer Week magazine said, "The GPO site stands out as an unassuming, information-rich offering." The internationally recognized management firm of Booz-Allen & Hamilton, Inc., called GPO Access "one of the Federal Government's largest and most active Web sites" and said that the site "has been highly successful in making government information easily available to the public."
- In 1997, GPO Access and the Commerce Department jointly earned a Hammer Award for creating the electronic Commerce Business Daily, known as CBDNet. Other awards have included a 1994 Technology Leadership Award and the prestigious 1995 James Madison Award from the Coalition on Government Information.
So extensive is this record, in fact, that when the Commerce Department announced the demise of the National Technical Information Service (NTIS) in August 1999, GPO used its experience in the Web-based dissemination of government information to offer to continue making the NTIS permanent collection of scientific and technical information available to the public. The only difference is that we would make it freely available to the public through our Federal Depository Library System and online--a project that we have in fact been working on with NTIS on a pilot basis--whereas in the past it has only been available through NTIS for a fee.
We at the GPO are justifiably proud of what we have accomplished in making government information available over the Web and of our commitment to ensure that it is available permanently. We invite Ms. Quint and all of Information Today's readers to visit GPO Access and see for themselves one of the Federal Government's premier Web sites.
Andrew M. Sherman
Director
Office of Congressional and Public Affairs
U.S. Government Printing Office
Upcoming FDLP Events
2001
Spring Council Meeting
April 1-4 San Antonio, TX
Interagency Depository Seminar
May 30-June 6 Washington, DC
Regionals Meeting
October 14 Alexandria, VA
Federal Depository Conference /
Fall Council Meeting
October 14-17 Alexandria, VA
2002
Spring Council Meeting
April 21-24, 2002 Mobile, AL
Federal Depository Conference /
Fall Council Meeting
October 20-23 Arlington, VA