November 10, 2012

Digital Preservation: NLM Launches Web Content Collecting Initiative

From the National Library of Medicine:

The National Library of Medicine (NLM) has launched a Web content collecting initiative. The Library is selecting Web content as part of its mission to collect, preserve, and make accessible the scholarly biomedical literature as well as resources that illustrate a diversity of philosophical and cultural perspectives not found in the technical literature. New forms of publication on the Web, such as blogs authored by doctors and patients, illuminate health care thought and practice in the 21st century. In launching this initiative, the Library is capturing and providing a unique resource for future scholarship.

The Library’s inaugural collection of Web content is “Health and Medicine Blogs,” presenting the perspectives of physicians, nurses, hospital administrators and other individuals in health care fields. The collection also includes patients chronicling their experiences with conditions such as cancer, diabetes and arthritis. The site currently contains 12 blogs, including KevinMD.com, “social media’s leading physician voice”; Not Running a Hospital, a blog by a former CEO of a large Boston hospital; e-patient Dave, a cancer survivor and leader in the participatory medicine movement; and Wheelchair Kamikaze, who writes about his personal experience living with multiple sclerosis (MS). The collection can be accessed from http://www.nlm.nih.gov/webcollecting.

Guided by the NLM Collection Development Manual and other strategic collecting efforts, NLM will continue to expand its capacity to collect Web content. With this initiative NLM has taken a major new step in its mission to collect pertinent health care information of today for the benefit of research in the future. Increasingly, that information is found on the Web, which is a rapidly changing environment where valuable and interesting materials can surface and then quickly disappear. The Library is working to ensure it can effectively collect new material in a Web environment, and guarantee the material’s permanence and availability to current and future patrons.

Direct to NLM Web Archive

The NLM web archiving initiative uses Archive-It, a fee-based service from the Internet Archive. They do great work, and we have used several of the collections they make available to the general public many times when doing research. At the moment, more than 1,700 collections are available from a wide range of sources.

One feature that distinguishes Archive-It collections from Wayback Machine archived pages (the Wayback Machine is another Internet Archive project) is that Archive-It archived pages are full-text searchable. In other words, you can search for specific words or phrases on these pages.

Also worth mentioning, Archive-It offers a web archiving program for K-12 students. Details here.

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.