Word sense ambiguity is a major impediment to the accurate automatic
processing of biomedical text. The Word Sense Disambiguation (WSD) test
collection is a collection of medical text in which ambiguities were
resolved by hand. It was built to support research on the automatic
resolution of word sense ambiguity.
The test collection consists of 50 ambiguous concepts drawn from the 1998 Medline citations.
The ambiguous cases were created using the UMLS Metathesaurus concepts provided by MetaMap.
Each ambiguous case has 100 instances randomly selected from Medline citations, for a total of 5,000 instances (50 cases x 100 instances).
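
The sampling scheme described above (a fixed number of randomly selected instances per ambiguous case) can be sketched roughly as follows. This is only an illustration, not the procedure actually used to build the collection: the function name `sample_instances`, the toy term names, the fake citation identifiers, and the fixed seed are all assumptions made for the example.

```python
import random


def sample_instances(citations_by_term, per_term=100, seed=0):
    """Draw a fixed-size random sample of citations for each ambiguous term.

    citations_by_term: dict mapping a term to a list of citation IDs
    (hypothetical structure, chosen for this sketch only).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    sample = {}
    for term, citations in sorted(citations_by_term.items()):
        if len(citations) < per_term:
            raise ValueError(f"not enough citations for {term!r}")
        # random.sample draws without replacement, so no citation repeats
        sample[term] = rng.sample(citations, per_term)
    return sample


# Toy stand-in data: 50 made-up terms, each with 250 fake citation IDs.
toy_pool = {
    f"term_{i:02d}": [f"pmid_{i:02d}_{j:03d}" for j in range(250)]
    for i in range(50)
}

collection = sample_instances(toy_pool)
total = sum(len(instances) for instances in collection.values())
print(len(collection), total)
```

With 50 terms and 100 instances per term, the sketch yields 5,000 instances in total, matching the size stated above.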