LHNCBC: Document Abstract
Year: 2006
LHNCBC-2006-022
Automatic Extraction of Bibliographic Information from Biomedical Online Journal Articles Using a String Matching Algorithm
Kim J, Le DX, Thoma GR
Proc. 19th IEEE International Symposium on Computer-Based Medical Systems, June 2006, Salt Lake City, Utah; 905-10
A system has been developed to extract bibliographic data (grant numbers and databank accession numbers) from online biomedical journal articles for the National Library of Medicine's MEDLINE database. Rule-based algorithms and a string matching algorithm are proposed to extract the bibliographic data from HTML-formatted articles. Experiments conducted with 411 medical articles from 73 journal issues show an accuracy exceeding 96%.
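The abstract does not give the rules or the matching algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical NIH-style grant number pattern (for example "R01 CA123456"), a hypothetical accession number pattern, and a small illustrative list of databank names. Python's standard `difflib.SequenceMatcher` stands in for the string matching step, and the names `GRANT_RE`, `ACCESSION_RE`, `KNOWN_DATABANKS`, `closest_databank`, and `extract_bibliographic_data` are invented for this example.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text nodes of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML string with whitespace normalized."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(" ".join(collector.chunks).split())


# Assumed rule: activity code, two-letter institute code, 5-6 digit serial.
GRANT_RE = re.compile(r"\b([A-Z]\d{2})[ -]?([A-Z]{2})[ -]?(\d{5,6})\b")

# Assumed rule: "accession number AY123456" or "accession no. AY123456".
ACCESSION_RE = re.compile(r"[Aa]ccession\s+(?:number|no\.?)\s+([A-Z]{1,2}\d{5,6})\b")

# Illustrative list only; a real system would use a curated databank list.
KNOWN_DATABANKS = ["GenBank", "PDB", "SWISSPROT"]


def closest_databank(word, threshold=0.8):
    """Return the known databank most similar to `word`, or None if too dissimilar."""
    def score(name):
        return SequenceMatcher(None, word.lower(), name.lower()).ratio()

    best = max(KNOWN_DATABANKS, key=score)
    return best if score(best) >= threshold else None


def extract_bibliographic_data(html):
    """Extract grant numbers and (databank, accession number) pairs from HTML."""
    text = html_to_text(html)
    grants = [f"{m.group(1)} {m.group(2)}{m.group(3)}" for m in GRANT_RE.finditer(text)]

    accessions = []
    for m in ACCESSION_RE.finditer(text):
        # Look at the few words before the match for a databank name.
        preceding = re.findall(r"[A-Za-z]+", text[:m.start()])[-6:]
        databank = None
        for word in reversed(preceding):
            databank = closest_databank(word)
            if databank is not None:
                break
        accessions.append((databank, m.group(1)))

    return {"grants": grants, "accessions": accessions}


sample = (
    "<p>Supported by NIH grant R01 CA123456. Sequences were deposited in "
    "<i>Genbank</i> under accession number AY123456.</p>"
)
result = extract_bibliographic_data(sample)
```

The approximate comparison lets a spelling variant such as "Genbank" resolve to the canonical "GenBank", while the regular expressions play the role of the rule-based step. The 0.8 similarity threshold is an arbitrary choice for illustration.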