REDEFINING A THESAURUS: TERM-CENTRIC NO MORE
Douglas Johnston, Stuart J. Nelson, MD, Jacque-Lynne Schulman, Allan G. Savage, Tammy P. Powell
The National Library of Medicine, Bethesda, Maryland
MeSH Goal: To provide a reproducible partition of concepts relevant to biomedicine for the purpose of organization of medical knowledge and information.
INTRODUCTION
As the MeSH thesaurus moves to a new maintenance environment, its data are being revised into a more formal structure centered on concepts and descriptor classes rather than on descriptors and terms alone. We see a descriptor as a class of concepts and a concept as a class of terms [1,2]. Each of these classes can then be represented by a unique numerical identifier that is independent of any term used to name the class and that persists when the name changes. This structure makes it possible to store additional useful data and to store existing data more efficiently.
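The three-level structure described above can be sketched as follows. This is a minimal illustration in Python, not the actual schema of the MeSH maintenance system: the class names, field names, identifier values, and sample terms are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    ui: int      # hypothetical numeric identifier for the term
    text: str


@dataclass
class Concept:
    """A concept is a class of terms that name the same meaning."""
    ui: int                  # identifier independent of any term text
    preferred_term_ui: int   # which member term currently names the concept
    terms: List[Term] = field(default_factory=list)

    def name(self) -> str:
        for term in self.terms:
            if term.ui == self.preferred_term_ui:
                return term.text
        raise LookupError("preferred term is not a member of this concept")


@dataclass
class Descriptor:
    """A descriptor is a class of concepts grouped for indexing and retrieval."""
    ui: int
    preferred_concept_ui: int
    concepts: List[Concept] = field(default_factory=list)

    def name(self) -> str:
        for concept in self.concepts:
            if concept.ui == self.preferred_concept_ui:
                return concept.name()
        raise LookupError("preferred concept is not a member of this descriptor")


# Illustrative data only; the identifiers are invented for this sketch.
anemia = Concept(
    ui=2001,
    preferred_term_ui=3001,
    terms=[Term(3001, "Anemia"), Term(3002, "Anaemia")],
)
descriptor = Descriptor(ui=1001, preferred_concept_ui=2001, concepts=[anemia])

name_before = descriptor.name()
anemia.preferred_term_ui = 3002   # the class is renamed...
name_after = descriptor.name()
ui_after = descriptor.ui          # ...but its identifier is unchanged
```

In this sketch the name of a class is derived from whichever member is marked as preferred, so changing the name never touches the identifier that other data would refer to.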
CONCLUSIONS
Expanding MeSH to include concepts explicitly makes the representation of existing information more efficient and allows additional information about concepts to be stored for the first time. This additional information makes it easier to identify concepts when they are considered as candidates for separation into new descriptor classes. It also makes truly synonymous terms available for export to applications or databases that define synonymy strictly, e.g., the UMLS Metathesaurus. At the same time, the new structure allows descriptors to remain as broad as retrieval requires, by grouping concepts as members of descriptor classes. Storing descriptor classes as unique identifiers (UIs) will make it easier to propagate changes in descriptor names throughout the database: instead of replacing every occurrence of a term, as is currently the case, only the table linking the UI with the preferred term need be changed. Referential integrity will thus be much easier to maintain.
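The point about propagating name changes can be illustrated with a small sketch. It assumes, hypothetically, that other records refer to a descriptor only by its UI and that a single lookup table maps each UI to its preferred term; the table layout, record fields, identifier values, and the new name used here are invented for the example.

```python
from typing import Dict, List

# Hypothetical linking table: descriptor UI -> preferred term.
# This is the only place where the descriptor's name is stored.
descriptor_names: Dict[int, str] = {1001: "Anemia"}

# Hypothetical records that reference the descriptor by UI, never by name.
indexed_records: List[dict] = [
    {"record_id": "r1", "descriptor_uis": [1001]},
    {"record_id": "r2", "descriptor_uis": [1001]},
]


def rename_descriptor(names: Dict[int, str], ui: int, new_name: str) -> None:
    """Change a descriptor's preferred term by updating one table row."""
    if ui not in names:
        raise KeyError(f"unknown descriptor UI: {ui}")
    names[ui] = new_name


def display_names(record: dict, names: Dict[int, str]) -> List[str]:
    """Resolve the UIs stored on a record to their current names."""
    return [names[ui] for ui in record["descriptor_uis"]]


before = [display_names(r, descriptor_names) for r in indexed_records]
rename_descriptor(descriptor_names, 1001, "Anemias")
after = [display_names(r, descriptor_names) for r in indexed_records]

# The records themselves were not edited; they still hold only the UI.
stored_uis = [r["descriptor_uis"] for r in indexed_records]
```

Because every record resolves its name through the table at read time, a single update is enough, and no record can be left holding a stale name.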
DEFINITIONS
Concept. A meaning named by a term.
Descriptor Class. A set of concepts closely related to each other in meaning. For the purposes of indexing and retrieval, these concepts are best lumped together. The traditional entry terms found in thesauri are often not synonymous; considering the concepts named by the terms as members of a larger class allows a more formal representation of the appropriate relationships.
Semantic Type. The basic category or categories of meaning of a concept. For example, Anemia is a "Disease or Syndrome".
Semantic Relation. A relation between two concepts. For example, "Narrower" means that the meaning of one concept is subsumed in the other.
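The definitions above can be made concrete with a short sketch showing how semantic types and semantic relations might attach to concepts, and how strict synonyms differ from the broader set of entry terms for a descriptor class. Everything here is an assumption for illustration: the class and method names, the identifiers, and the placeholder second concept are not taken from MeSH. Only the "Disease or Syndrome" type for Anemia comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Concept:
    ui: int
    terms: List[str]   # terms naming this one meaning (strict synonyms)
    semantic_types: List[str] = field(default_factory=list)


@dataclass
class DescriptorClass:
    ui: int
    concepts: List[Concept]
    # Each relation (a, name, b) reads: concept a is <name> than concept b.
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def entry_terms(self) -> List[str]:
        """All terms of all member concepts; not necessarily synonymous."""
        return [t for c in self.concepts for t in c.terms]

    def synonyms_of(self, concept_ui: int) -> List[str]:
        """Only the terms of one concept, suitable for strict-synonymy export."""
        for c in self.concepts:
            if c.ui == concept_ui:
                return list(c.terms)
        raise KeyError(f"unknown concept UI: {concept_ui}")

    def narrower_than(self, concept_ui: int) -> List[int]:
        """UIs of concepts whose meaning is subsumed in the given concept."""
        return [a for (a, name, b) in self.relations
                if name == "Narrower" and b == concept_ui]


# Illustrative data; the second concept is a deliberate placeholder.
anemia = Concept(2001, ["Anemia", "Anaemia"], ["Disease or Syndrome"])
placeholder = Concept(2002, ["narrower term A", "narrower term B"])
descriptor_class = DescriptorClass(
    ui=1001,
    concepts=[anemia, placeholder],
    relations=[(2002, "Narrower", 2001)],
)

all_entry_terms = descriptor_class.entry_terms()
strict_synonyms = descriptor_class.synonyms_of(2001)
narrower_uis = descriptor_class.narrower_than(2001)
```

The descriptor class still exposes every entry term for retrieval, while an export that requires strict synonymy can take the terms of a single concept.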
The New System Solves Problems
Problem: Only two data objects: descriptors and terms.
Solution: Add a data object: concepts, as members of descriptor classes.
Solution: Refer to descriptors by UI.
REFERENCES
1. Schuyler PL, Hole WT, Tuttle MS, Sherertz DD. The UMLS Metathesaurus: representing different views of biomedical concepts. Bull Med Lib Assoc 1993, 81(2):217-22.
2. Tuttle MS, Sperzel WD, Olson NE, Erlbaum MS, Suarez-Munist O, Sherertz DD, Nelson SJ, Fuller LF. The homogenization of the Metathesaurus schema and distribution format. Proceedings of the Annual Symposium on Computer Applications in Medical Care. 1992:299-303.