LHNCBC: Document Abstract
Year: 2005
LHNCBC-2005-022
Automated Labeling Of Biomedical Online Journal Articles
Kim J, Le DX, Thoma GR
In: Callaos N, Lesso W, editors. SCI 2005. Proc. 9th World Multiconference on Systemics, Cybernetics and Informatics; 2005 Jul 10-13; Vol. 4; Orlando (FL): International Institute of Informatics and Systemics; c2005. p. 406-11.
An automated labeling (AL) module has been developed to extract bibliographic data (e.g., article title, authors, affiliation, and abstract) from online biomedical journals for the National Library of Medicine's MEDLINE database. The AL module uses string matching, statistics, and fuzzy rule-based algorithms to label segmented zones in an article's HTML pages as specific bibliographic data. Experiments with 1,267 medical articles from 64 journal issues yielded an accuracy of about 97.71%.
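The abstract does not give the module's actual rules, so the following Python sketch only illustrates the general idea of combining string matching with fuzzy rules to label a text zone. Everything in it is an assumption made for illustration: the function names, the keyword list, the word-count thresholds, and the use of min as the fuzzy AND are hypothetical and are not taken from the paper.

```python
import re
from difflib import SequenceMatcher

# Hypothetical keyword list; the paper's real features are not given in the abstract.
AFFILIATION_WORDS = ("university", "department", "institute", "hospital", "school")

# Hypothetical pattern for one author written as "Surname Initials", e.g. "Thoma GR".
AUTHOR_PIECE = re.compile(r"[A-Z][a-z]+(?:-[A-Z][a-z]+)? [A-Z]{1,3}")


def similarity(a: str, b: str) -> float:
    """String-matching score in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def membership_short(n_words: int, full: int = 5, zero: int = 40) -> float:
    """Fuzzy membership of 'zone is short': 1.0 up to `full` words,
    falling linearly to 0.0 at `zero` words. The thresholds are made up;
    a real system would presumably estimate them from labeled zones."""
    if n_words <= full:
        return 1.0
    if n_words >= zero:
        return 0.0
    return (zero - n_words) / (zero - full)


def label_zone(text: str, page_title: str) -> str:
    """Return the label whose fuzzy rule fires most strongly for this zone."""
    n_words = len(text.split())
    short = membership_short(n_words)
    lower = text.lower()

    # Authors rule: fraction of comma-separated pieces that look like a name.
    pieces = [p.strip() for p in text.split(",") if p.strip()]
    name_like = sum(1 for p in pieces if AUTHOR_PIECE.fullmatch(p))
    authors_score = name_like / len(pieces) if pieces else 0.0

    has_keyword = 1.0 if any(w in lower for w in AFFILIATION_WORDS) else 0.0

    scores = {
        # Title rule: resembles the HTML <title> AND is short (min = fuzzy AND).
        "title": min(similarity(text, page_title), short),
        "authors": authors_score,
        # Affiliation rule: contains an institution keyword AND is short.
        "affiliation": min(has_keyword, short),
        # Abstract rule: the zone is long.
        "abstract": 1.0 - short,
    }
    return max(scores, key=scores.get)
```

For example, under these assumed rules, label_zone("Kim J, Le DX, Thoma GR", page_title) returns "authors" for any page title that does not closely resemble the author string, because every comma-separated piece matches the name pattern.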