REDEFINING A THESAURUS: TERM-CENTRIC NO MORE

Douglas Johnston, Stuart J. Nelson, MD, Jacque-Lynne Schulman, Allan G. Savage, Tammy P. Powell
The National Library of Medicine, Bethesda, Maryland
MeSH Goal: To provide a reproducible partition of concepts relevant to biomedicine for the purpose of organization of medical knowledge and information.


INTRODUCTION

As the MeSH thesaurus moves to a new maintenance environment, the structure of MeSH data is being revised to a more formal one, centered on concepts and descriptor classes rather than only on descriptors and terms. We see a descriptor as a class of concepts and a concept as a class of terms [1,2]. Each of these classes can then be represented by a unique numerical identifier that is independent of any term used to name the class and that persists through a change in such a name. This structure makes it possible to store additional useful data and to store existing data more efficiently.

CONCLUSIONS

• Expanding MeSH to explicitly include concepts makes the representation of existing information more efficient and allows the representation of additional information about concepts that was not previously stored.

• The additional information about concepts makes it easier to identify concepts that are candidates for separation into new descriptor classes. It also makes truly synonymous terms available for export to applications or databases that have a strict definition of synonymy, e.g., the UMLS Metathesaurus.

• At the same time, the new structure allows descriptors to remain as broad as needed for purposes of retrieval by grouping concepts as members of descriptor classes.


• Referring to descriptor classes by unique identifiers (UIs) will make it easier to propagate changes in descriptor names throughout the database. Instead of replacing each occurrence of the term, as is currently the case, only the table linking the UI with the preferred term need be changed. Referential integrity will thus be much easier to maintain.
DEFINITIONS

Concept. A meaning named by a term.

Descriptor Class. A set of concepts closely related to each other in meaning. For the purposes of indexing and retrieval, these concepts are best lumped together. The traditional entry terms found in thesauri are often not synonymous; considering the concepts named by the terms as members of a larger class allows a more formal representation of the appropriate relationships.

Semantic Type. The basic category or categories of meaning of a concept. For example, Anemia is a "Disease or Syndrome".

Semantic Relation. A relation between two concepts. For example, "Narrower" means that the meaning of one concept is subsumed in the other.

The New System Solves Problems

Problem: Only two data objects: descriptors and terms
  • Cannot attach concept attributes (e.g., definition) to a specific concept.
  • Cannot store relations between terms (e.g., synonymy).
  • Redundant storage of concept attributes in terms (e.g., SR).


Problem: Descriptors de-normalized
  • Descriptors are often represented only by a term.
  • When a descriptor name changes, each occurrence of the name must be replaced throughout the database.


Solution: Add a data object: concepts, as members of descriptor classes
  • Attach concept attributes to the appropriate concept.
  • Group terms into concept classes of equivalent terms.
  • Store concept attributes (e.g., SR) only on the concept.
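
The three-level structure described above (descriptor class, concept, term) can be sketched as a small data model. This is only an illustration under stated assumptions, not the actual MeSH schema: the class and field names, the integer UIs, the placeholder definition, and the sample terms "Anaemia" and "Chronic Anemia" are all hypothetical. Only the "Anemia" / "Disease or Syndrome" pairing comes from the definitions in this poster.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    """A meaning named by one or more equivalent terms."""
    ui: int                      # hypothetical numeric identifier
    preferred_term: str
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""         # stored once, on the concept, not on each term
    semantic_types: List[str] = field(default_factory=list)

    def terms(self) -> List[str]:
        return [self.preferred_term] + self.synonyms


@dataclass
class DescriptorClass:
    """A set of closely related concepts grouped for indexing and retrieval."""
    ui: int                      # hypothetical numeric identifier
    preferred_concept: Concept
    other_concepts: List[Concept] = field(default_factory=list)

    def concepts(self) -> List[Concept]:
        return [self.preferred_concept] + self.other_concepts

    def entry_terms(self) -> List[str]:
        # Every term of every member concept: the broad set used for retrieval.
        return [t for c in self.concepts() for t in c.terms()]

    def strict_synonyms(self, term: str) -> List[str]:
        # Only terms that name the same concept count as true synonyms.
        for c in self.concepts():
            if term in c.terms():
                return [t for t in c.terms() if t != term]
        return []


# Illustrative data only; not actual MeSH content.
anemia = Concept(ui=1, preferred_term="Anemia", synonyms=["Anaemia"],
                 definition="(placeholder definition)",
                 semantic_types=["Disease or Syndrome"])
narrower = Concept(ui=2, preferred_term="Chronic Anemia",
                   semantic_types=["Disease or Syndrome"])
descriptor = DescriptorClass(ui=100, preferred_concept=anemia,
                             other_concepts=[narrower])

print(descriptor.entry_terms())             # ['Anemia', 'Anaemia', 'Chronic Anemia']
print(descriptor.strict_synonyms("Anemia"))  # ['Anaemia']
```

In this sketch the descriptor stays broad for retrieval (all entry terms), while strict synonymy is available per concept, which is the distinction an export to a system with a strict definition of synonymy would rely on.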



Solution: Refer to descriptors by UI

  • For a descriptor name change, change only the link between the name and the descriptor UI.
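
The effect of referring to descriptors by UI can be shown with a minimal sketch. The table layout, the function names, the UI value, and the record contents below are assumptions made for illustration; they do not describe the actual MeSH maintenance system.

```python
from typing import Dict, List

# The only place a descriptor name is stored: UI -> preferred name.
# (Hypothetical UI and placeholder names.)
descriptor_names: Dict[int, str] = {100: "Old Descriptor Name"}

# Other records refer to the descriptor by UI, never by name.
records: List[dict] = [
    {"record_id": "rec-1", "descriptor_uis": [100]},
    {"record_id": "rec-2", "descriptor_uis": [100]},
]


def rename_descriptor(names: Dict[int, str], ui: int, new_name: str) -> None:
    """Change a descriptor name by updating the single UI-to-name link."""
    if ui not in names:
        raise KeyError(f"unknown descriptor UI: {ui}")
    names[ui] = new_name


def display_names(record: dict, names: Dict[int, str]) -> List[str]:
    """Resolve a record's descriptor UIs to their current names."""
    return [names[ui] for ui in record["descriptor_uis"]]


rename_descriptor(descriptor_names, 100, "New Descriptor Name")

# Every record now shows the new name, although no record was edited.
print([display_names(r, descriptor_names) for r in records])
```

Because the records hold only the UI, a rename touches one entry instead of every occurrence of the name, and no record can be left pointing at a stale name.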

REFERENCES

1. Schuyler PL, Hole WT, Tuttle MS, Sherertz DD. The UMLS Metathesaurus: representing different views of biomedical concepts. Bull Med Libr Assoc. 1993;81(2):217-22.

2. Tuttle MS, Sperzel WD, Olson NE, Erlbaum MS, Suarez-Munist O, Sherertz DD, Nelson SJ, Fuller LF. The homogenization of the Metathesaurus schema and distribution format. Proceedings of the Annual Symposium on Computer Applications in Medical Care. 1992:299-303.
Last updated: 20 November 2001