LHNCBC: Document Abstract
Year: 2004
LHNCBC-2004-059
Automated Labeling for Biomedical Journals Published in Foreign Languages
Kim J, Le DX, Thoma GR
Proc. 8th World Multiconference on Systemics, Cybernetics and Informatics. 2004 Jul:444-9.
An automated labeling (AL) module has been developed to produce bibliographic record fields, such as English title, vernacular title, author, affiliation, and English abstract, from biomedical articles published in foreign-language journals. The labeling process takes as input the optical character recognition (OCR) output from scanned biomedical journals. Because the words that occur frequently in a zone are important indicators of its type, word lists serve as key features in the AL module. The module's rule-based labeling algorithms draw on geometric and contextual features of each zone, together with the geometric relations between zones. The algorithms use 131 rules derived for foreign-language journals. Experiments conducted with several medical journal articles show about 95% accuracy.
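
As a rough illustration of how word-list, geometric, and zone-relation features might combine in rule-based labeling, here is a minimal Python sketch. It is not the AL module's implementation: the `Zone` fields, the word lists, the thresholds, and the four rules are all hypothetical stand-ins, since the abstract does not list the 131 rules or the actual word lists.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical word lists; the actual lists are not given in the abstract.
AFFILIATION_WORDS = {"university", "hospital", "department", "institute"}
ABSTRACT_WORDS = {"abstract", "summary"}


@dataclass
class Zone:
    text: str      # OCR text of the zone
    top: float     # vertical position on the page, 0.0 (top) to 1.0 (bottom)
    height: float  # average character height in points


def word_hits(text: str, word_list: set) -> int:
    """Count the words in the zone text that appear in the word list."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return sum(1 for w in words if w in word_list)


def label_zone(zone: Zone, title_zone: Optional[Zone] = None) -> str:
    """Apply a few illustrative rules in priority order (thresholds are assumed)."""
    # Contextual rules: word-list matches.
    if word_hits(zone.text, ABSTRACT_WORDS) > 0:
        return "abstract"
    if word_hits(zone.text, AFFILIATION_WORDS) > 0:
        return "affiliation"
    # Geometric rule: large type near the top of the page.
    if zone.top < 0.3 and zone.height >= 14:
        return "title"
    # Zone-relation rule: a zone sitting just below the title zone.
    if title_zone is not None and 0 < zone.top - title_zone.top < 0.1:
        return "author"
    return "other"


title = Zone("Automated Labeling for Biomedical Journals", 0.10, 16.0)
author = Zone("Kim J, Le DX, Thoma GR", 0.15, 10.0)
affiliation = Zone("Department of Medicine, Example University", 0.20, 9.0)
labels = [label_zone(z, title_zone=title) for z in (title, author, affiliation)]
```

A production rule set would also need to separate English titles from vernacular titles and to tolerate OCR errors, neither of which this sketch attempts.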