
XML MeSH, 2007. Documentation and Availability

1. General

XML MeSH contains all currently maintained MeSH data. A list of XML Data Elements is available. Also available is a conversion table that lists ELHILL MeSH and ASCII MeSH elements with the corresponding XML MeSH tags. XML MeSH also includes data that are not in earlier MeSH formats, such as a concept structure similar to that of the UMLS. For more detailed information on the concept structure and the XML format, a background narrative is available.

2. Restrictions on use.

There is no charge. Use of the XML MeSH file data is subject to conditions which are detailed in the Memorandum of Understanding.

3. Availability

The data for Descriptors and Qualifiers are updated annually and users of the data are encouraged to obtain the new year's data.

Supplementary Concept Records (formerly Supplementary Chemical Records) are updated in-house on a daily basis and are released in XML weekly (Sunday). They are coordinated with 2007 MeSH descriptors, so that data elements that refer to a specific descriptor, such as the <HeadingMappedTo> element, have been updated to match a descriptor in 2007 MeSH.

MeSH Descriptors and Qualifiers are also published annually in a printed version, including MeSH Trees.

4. File format

4.1 Compressed files

Because the XML MeSH files are considerably larger than the ASCII MeSH files, the Descriptor and SCR files are available both in compressed format (ZIP and GZ) and in full, uncompressed format. The Qualifier file is available only as the full XML file.
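A compressed file does not have to be expanded on disk before it is read. The following is a minimal sketch, assuming Python's standard gzip and xml.etree.ElementTree modules, of counting records directly from a GZ file while streaming. The file name sample_desc.gz, the root element name, and the two-record sample are illustrative assumptions and are far simpler than a real MeSH file.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical two-record sample standing in for a real download.
# The root element name and the flat record layout are assumptions.
SAMPLE_XML = (
    b"<DescriptorRecordSet>"
    b"<DescriptorRecord><String>Heart</String></DescriptorRecord>"
    b"<DescriptorRecord><String>Lung</String></DescriptorRecord>"
    b"</DescriptorRecordSet>"
)


def count_records(path, record_tag="DescriptorRecord"):
    """Count record_tag elements in a gzip-compressed XML file."""
    count = 0
    with gzip.open(path, "rb") as handle:
        # iterparse reads incrementally, so the whole file is never
        # held in memory as one string.
        for _event, elem in ET.iterparse(handle, events=("end",)):
            if elem.tag == record_tag:
                count += 1
                elem.clear()  # drop the record's children once counted
    return count


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_desc.gz")  # hypothetical name
    with gzip.open(sample_path, "wb") as out:
        out.write(SAMPLE_XML)
    record_count = count_records(sample_path)

print(record_count)
```

Streaming matters here because the uncompressed Descriptor and SCR files are hundreds of megabytes; loading one whole into memory as a parsed tree can be costly.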

4.2 ASCII and UTF-8

Almost all data in XML MeSH files are in 7-bit ASCII format. The only exceptions are some non-print entry terms and Descriptor Annotations that contain one or more characters with diacritics, such as an "o" with an umlaut. These are encoded in UTF-8 and are displayed correctly by applications that read UTF-8; other applications may render them inconsistently.
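The effect of reading UTF-8 data with the wrong decoder can be shown in a few lines. This is a small sketch in Python; the sample string is illustrative only and is not taken from a MeSH file.

```python
# Illustrative term containing an "o" with an umlaut (U+00F6).
term = "Sj\u00f6gren"

# In UTF-8 the umlauted "o" occupies two bytes, 0xC3 0xB6,
# while each plain ASCII letter occupies one byte.
encoded = term.encode("utf-8")
print(len(term), len(encoded))

# A UTF-8 aware reader recovers the original text.
round_trip = encoded.decode("utf-8")

# A Latin-1 reader shows two unrelated characters instead of one.
mis_decoded = encoded.decode("latin-1")

# A strict 7-bit ASCII reader rejects the bytes outright.
try:
    encoded.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(round_trip == term, mis_decoded != term, ascii_ok)
```

Opening the files with an explicit UTF-8 encoding avoids depending on a platform default.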

4.3 XML tagged format

Like all XML data, XML MeSH data consist of text bounded by start and end tags specific to each data element. For example, the tags for a Descriptor record are:

 <DescriptorRecord ...> ... </DescriptorRecord>

An example of a term is:

 <String>Heart</String>

Each data element or occurrence is contained on a single line, but this is not required by the XML format, which uses the end tag to mark the end of the data unambiguously.
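As a brief illustration that line layout carries no meaning for an XML parser, the sketch below parses the same simplified record written on one line and across several lines, using Python's standard xml.etree.ElementTree module. The fragment is a hypothetical reduction built from the two tags shown above; real Descriptor records contain many more elements.

```python
import xml.etree.ElementTree as ET

# The same simplified, hypothetical record in two layouts.
ONE_LINE = "<DescriptorRecord><String>Heart</String></DescriptorRecord>"
MULTI_LINE = """<DescriptorRecord>
  <String>Heart</String>
</DescriptorRecord>"""


def term_strings(xml_text):
    """Return the text of every String element in the fragment."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.iter("String")]


one_line_terms = term_strings(ONE_LINE)
multi_line_terms = term_strings(MULTI_LINE)

# Both layouts yield the same list of terms.
print(one_line_terms == multi_line_terms)
```

Because the parser relies on tags rather than line breaks, line-oriented tools such as grep work on the files only as a convenience of the current layout.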

For more detailed information on the XML format of MeSH, a background narrative is available.

5. Contents of files. 2007 MeSH.

Record Type                         Total Records   File size [1]   Bytes [1]      ZIP file   GZ file
Descriptors                         24,357          260MB           272,636,172    14MB       14MB
Qualifiers                          83              461KB           471,753        ---        ---
Supplementary Concept Records [2]   164,042         474MB           497,162,442    32MB       32MB

ZIP, GZ =  compressed formats.

All sizes apply to files on the NLM Unix file server. (So byte counts do not include characters for CR or EOF.)

1 Uncompressed.
2 Formerly Supplementary Chemical Records.
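The rounded file sizes in the table appear to use binary units (1KB = 1,024 bytes, 1MB = 1,048,576 bytes). This is an inference from the figures, not something the page states. A short Python check of that assumption against the byte counts listed above:

```python
# Byte counts as listed in the table for the uncompressed files.
BYTE_COUNTS = {
    "Descriptors": 272_636_172,
    "Qualifiers": 471_753,
    "Supplementary Concept Records": 497_162_442,
}

KB = 1024        # assumed binary kilobyte
MB = 1024 * KB   # assumed binary megabyte


def rounded_size(n_bytes):
    """Format a byte count as whole MB, or whole KB when under 1MB."""
    if n_bytes >= MB:
        return f"{round(n_bytes / MB)}MB"
    return f"{round(n_bytes / KB)}KB"


sizes = {name: rounded_size(count) for name, count in BYTE_COUNTS.items()}
for name, size in sizes.items():
    print(name, size)
```

Under this assumption the computed values agree with the File size column (260MB, 461KB, and 474MB).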

6. For questions concerning the content of XML MeSH, contact:

Stuart Nelson, M.D.
Head, Medical Subject Headings
National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894

voice: 301-496-1495; FAX: 301-402-2002
email: nelson@nlm.nih.gov

For questions concerning distribution, format, etc., contact:
Jacque-Lynne Schulman
Medical Subject Headings
voice: 301-496-1495; FAX: 301-402-2002
email: schulman@nlm.nih.gov

Last reviewed: 02 October 2006
Last updated: 02 October 2006
First published: 27 September 2006
Permanence level: Permanent: Dynamic Content