Last updated: April 27, 2009

Staff Bibliography

Document Citation

Title:
Archiving a Historic Medico-legal Collection: Automation and Workflow Customization.

Author(s):
Misra D, Mao S, Rees J, Thoma GR.

Institution(s):
1) Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland 20894, USA.

Source:
Proc. IS&T Archiving 2007. Arlington, Virginia. May 2007:157-61.

Abstract:
The U.S. National Library of Medicine (NLM) has acquired a historical collection of documents, released by the Food and Drug Administration, specifying the Notices of Judgment (NJs) against manufacturers of adulterated or misbranded food, drugs and cosmetics. These documents, consisting of 70,000+ pages containing more than 65,000 NJs, are to be preserved and made accessible over the long term due to their legal and historical value. We developed a preservation system, named SPER (System for Preservation of Electronic Resources), based on DSpace infrastructure, for archiving and disseminating the NJs contained in these documents. For efficiency and cost-effectiveness, we developed algorithms to automatically identify the NJs and extract metadata from their contents; an archivist then reviews and edits the metadata and ingests the NJs into the archive. Contents of the documents are also captured as text streams to provide full-text search capability for the NJs. These functionalities required a number of changes to the open source DSpace software, including changing the ingest interface and workflow, handling a metadata schema that does not map to Dublin Core, and enhancing the database schema. This paper describes the overall SPER system, the customized workflow for automated metadata extraction, the automated metadata extraction process, and an estimate of labor savings through automation.

Publication Type: CONFERENCE

