LHNCBC: Document Abstract
Year: 2007
LHNCBC-2007-006
Identification of 'comment-on sentences' in Online Biomedical Documents Using Support Vector Machines
Kim IC, Le DX, Thoma GR
Proc. SPIE conference on Document Recognition and Retrieval, vol. 6500, pp.65000O-1-65000O-8, San Jose, CA, January 2007
MEDLINE is the premier bibliographic online database of the National Library of Medicine, containing approximately 14 million citations and abstracts from over 4,800 biomedical journals. This paper presents an automated method based on support vector machines to identify a 'comment-on' list, which is a field in a MEDLINE citation denoting previously published articles commented on by a given article. For comparative study, we also introduce another method based on scoring functions that estimate the significance of each sentence in a given article. Preliminary experiments conducted on HTML-formatted online biomedical documents collected from 24 different journal titles show that the support vector machine with polynomial kernel function performs best in terms of recall and F-measure rates.
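The abstract compares methods by recall and F-measure. As a rough illustration of how such rates are typically computed for a sentence-identification task, here is a minimal sketch. It assumes the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not specify; the function name and the sentence indices are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(predicted, actual):
    """Return (precision, recall, F-measure) for two collections of sentence ids.

    Assumption: the balanced F-measure (F1) is used; the abstract does not
    say which variant the authors report.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sentence indices, invented for illustration only.
actual_comment_on = {3, 7, 12, 20}    # sentences that truly are 'comment-on'
predicted_comment_on = {3, 7, 15}     # sentences a classifier flagged

p, r, f = precision_recall_f1(predicted_comment_on, actual_comment_on)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

With these made-up inputs, two of the three flagged sentences are correct and two of the four true sentences are found, so precision is 2/3, recall is 1/2, and the balanced F-measure is 4/7.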