Cognitive Science Branch  

Word Sense Disambiguation

Word sense ambiguity is a major impediment to the accurate automatic management of biomedical text. The Word Sense Disambiguation test collection is a set of medical texts in which ambiguities were resolved by hand, built to support research on the automatic resolution of word sense ambiguity.

The test collection consists of 50 ambiguous concepts from the 1998 Medline. Ambiguous cases were created using the UMLS Metathesaurus concepts provided by MetaMap. Each ambiguous case has 100 instances randomly selected from Medline citations, for a total of 5,000 instances.
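
The shape of the collection (50 ambiguous cases, 100 randomly selected instances each) can be illustrated with a minimal Python sketch. This is not the procedure used to build the collection: the function `sample_instances`, the placeholder term names, and the synthetic citation strings are all hypothetical, and the pool of 500 candidate citations per term is an arbitrary assumption made only so that there is something to sample from.

```python
import random

NUM_TERMS = 50            # ambiguous cases in the collection (from the text)
INSTANCES_PER_TERM = 100  # instances per ambiguous case (from the text)


def sample_instances(citations_by_term, per_term=INSTANCES_PER_TERM, seed=0):
    """Randomly pick `per_term` citations for each ambiguous term.

    Hypothetical helper: sampling is without replacement and uses a
    fixed seed so the illustration is reproducible.
    """
    rng = random.Random(seed)
    return {
        term: rng.sample(citations, per_term)
        for term, citations in citations_by_term.items()
    }


# Synthetic placeholder data standing in for Medline citations
# (assumed pool of 500 candidates per term; not real data).
citations_by_term = {
    f"term_{i}": [f"citation_{i}_{j}" for j in range(500)]
    for i in range(NUM_TERMS)
}

collection = sample_instances(citations_by_term)
total = sum(len(instances) for instances in collection.values())
print(len(collection), "cases,", total, "instances")
```

With these placeholder inputs the sketch yields 50 cases of 100 instances each, matching the 5,000-instance total described above.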