LHNCBC: Document Abstract

|

|

FAQs


	Home
	Welcome
	Organization
	Visitor Information
	Staff Directory

	Medical Informatics
	Language & Knowledge Processing
	Image Processing
	Information Systems
	Infrastructure Research
	Multimedia Visualization

	Published Articles
	Technical Reports
	Lectures

	Training Opportunities
	Employment Opportunities

LHNCBC: Document Abstract

Year: 2006	Download Free Adobe Acrobat Reader
LHNCBC-2006-002
Word Sense Disambiguation by Selecting the Best Semantic Type Based on Journal Descriptor Indexing: Preliminary Experiment
Humphrey SM, Rogers WJ, Kilicoglu H, Demner-Fushman D, Rindflesch TC
Journal of the American Society for Information Science and Technology. 2006 Jan 1;57(1):96-113.
An experiment was performed at the National Library of Medicine (NLM) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System (UMLS) Metathesaurus. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity transport has two meanings: 'Biological Transport' assigned the ST Cell Function and 'Patient transport' assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing transport and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
PDF

Lister Hill National Center for Biomedical Communications
U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894
National Institutes of Health, Department of Health & Human Services
Copyright, Privacy, Accessibility, Freedom of Information Act
USA.gov, Applications & Plug-Ins
Site last updated: 30 January 2009