U.S. National Library of Medicine: MBR Files

Users are responsible for compliance with all applicable MEDLINE®/PubMed® and other NLM Databases License Agreements.

The MEDLINE citations included in this MEDLINE Baseline Repository were retrieved in their respective years and represent a static view of MEDLINE at that time.



      Available Files:

The files below are available from the MEDLINE Baseline Repository and include every file we generate while processing each baseline. We compress the larger files with GNU's gzip utility and use the Unix tar command to bundle related files into a single download. The compressed files can be expanded with either GNU's gunzip utility or WinZip. WinZip can also read the tar files and extract the individual files they contain.
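For readers who prefer a script to gunzip or WinZip, the following is a minimal sketch that expands the downloads using only the Python standard library. The function names and the example file names in the comments are illustrative assumptions, not part of the Repository.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def expand_gz(src: Path, dest_dir: Path) -> Path:
    """Decompress one gzip file (for example MH_freq_count.gz) into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    out_path = dest_dir / src.with_suffix("").name  # drop the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(out_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # copies in chunks, so large files are fine
    return out_path


def expand_tar_gz(src: Path, dest_dir: Path) -> list[Path]:
    """Unpack the regular files of a gzipped tar archive into dest_dir.

    Directory structure inside the archive is flattened on purpose, so a
    member can never be written outside dest_dir.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(src, "r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and special files
            out_path = dest_dir / Path(member.name).name
            with archive.extractfile(member) as fin, open(out_path, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            written.append(out_path)
    return written
```

A call such as expand_gz(Path("MH_freq_count.gz"), Path("mbr")) would leave an uncompressed MH_freq_count file in the mbr directory. Flattening the archive paths is a design choice made for safety here; it would need revisiting if an archive held two files with the same name in different directories.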

To download a file, move your cursor over its file icon, press the right mouse button, and select the "Save Link Target as ..." option.

The frequency count files give a complete count over the MEDLINE baseline for each category. For example, the MH_freq_count file gives a count for each unique MeSH Heading found in MEDLINE for the given baseline year. Each term has an overall count and, where applicable (MH and SH terms only), a count of how often the term was starred. A starred term is an IM (Index Medicus) index term, which represents one of the most significant points of an article. Each frequency count category comes in at least two versions that differ only in sort order: MH_freq_count is ordered by the overall frequency count of each term, while MH_freq_alpha is ordered alphabetically by MeSH Heading. MH_major_freq_count is sorted by the count of how often each term was starred. The README file associated with each baseline explains the format of each file in greater detail.
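The three sort orders can be illustrated with a short sketch. The record layout used here ("term|total|major", one record per line) is an assumption made for illustration only; the README for each baseline documents the real format. The sample terms and counts are made up.

```python
from typing import NamedTuple


class TermCount(NamedTuple):
    term: str
    total: int  # overall number of occurrences
    major: int  # occurrences where the term was starred (IM)


def parse_counts(lines):
    """Parse 'term|total|major' records (assumed layout, see the README)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right so a '|' inside the term would not break parsing.
        term, total, major = line.rsplit("|", 2)
        records.append(TermCount(term, int(total), int(major)))
    return records


def by_count(records):
    """Order used by the *_freq_count files: highest overall count first."""
    return sorted(records, key=lambda r: (-r.total, r.term))


def by_alpha(records):
    """Order used by the *_freq_alpha files: alphabetical by term."""
    return sorted(records, key=lambda r: r.term.lower())


def by_major(records):
    """Order used by the *_major_freq_count files: highest starred count first."""
    return sorted(records, key=lambda r: (-r.major, r.term))


# Made-up sample data, not real MEDLINE counts.
sample = ["Humans|900|10", "Aspirin|120|80", "Mice|400|15"]
records = parse_counts(sample)
print([r.term for r in by_count(records)])  # ['Humans', 'Mice', 'Aspirin']
print([r.term for r in by_alpha(records)])  # ['Aspirin', 'Humans', 'Mice']
print([r.term for r in by_major(records)])  # ['Aspirin', 'Mice', 'Humans']
```

Ties are broken alphabetically in this sketch; how the Repository files break ties is not stated on this page.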

The raw data files are the files we generated for the database behind our MEDLINE Baseline Repository Query tool. We provide them here because others may find them useful, and sharing them helps avoid duplicated effort. The README file associated with each baseline explains the format of each file in greater detail.

We also provide two files that look at the MeSH Headings assigned to the completed citations for the given baseline or year. The "hist" file is a frequency count of MeSH Headings based on their assigned MeSH Treecodes. The "histST" file is a frequency count of MeSH Headings based on their UMLS Semantic Types and, more specifically, their UMLS Semantic Groupings (groups of Semantic Types). Several graphs are included to help illustrate this data.
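As a rough illustration of a Treecode-based frequency count, the sketch below buckets each assigned MeSH Heading by the top-level category letter of its tree numbers. The choice of bucket (the first character of the tree number) is an assumption; the actual granularity of the "hist" file is described in its README. The heading-to-treecode mapping is sample data and may not match any given MeSH year.

```python
from collections import Counter


def treecode_histogram(citation_headings, treecodes):
    """Count heading assignments per top-level MeSH tree category.

    citation_headings: iterable of lists, one list of headings per citation.
    treecodes: dict mapping a heading to its list of tree numbers.
    """
    hist = Counter()
    for headings in citation_headings:
        for heading in headings:
            # A heading may sit in several trees; count each category once
            # per assignment. Headings without a treecode are ignored.
            categories = {code[0] for code in treecodes.get(heading, [])}
            hist.update(categories)
    return hist


# Illustrative sample data only.
treecodes = {
    "Aspirin": ["D02.455.426.559.389.657.410.595.176"],
    "Neoplasms": ["C04"],
    "Humans": ["B01.050.150.900.649.313.988.400.112.400.400"],
}
citations = [["Aspirin", "Humans"], ["Neoplasms", "Humans"], ["Aspirin"]]

for category, count in sorted(treecode_histogram(citations, treecodes).items()):
    print(category, count)
```

With this sample data the categories B and D are each counted twice and C once.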

The related MeSH Vocabulary data files are also included here so that all of the year-specific data you might need for your research is available in one place.

The SemGroups.txt file is the latest addition to the Repository. It is unclear whether this file is updated each year as the Semantic Types change or is static. The file groups the UMLS Semantic Types into 15 (currently) high-level categories. We are using it to see whether we can detect patterns in how MeSH Headings are assigned in MEDLINE. Two papers describe the grouping of the Semantic Types in much greater detail: McCray AT, Burgun A, Bodenreider O. "Aggregating UMLS semantic types for reducing conceptual complexity." Medinfo. 2001;10(Pt 1):216-20; and Bodenreider O, McCray AT. "Exploring semantic groups through visual approaches." Journal of Biomedical Informatics. 2003;36(6):414-432. Both papers can be found at the Lister Hill National Center for Biomedical Communications web site (http://lhncbc.nlm.nih.gov) under the "Publications & Lectures" section.
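The grouping can be applied with a few lines of code. The sketch below assumes each line of SemGroups.txt has four pipe-separated fields (group abbreviation, group name, Semantic Type ID, Semantic Type name); check the downloaded file before relying on that layout. The function names and the per-heading Semantic Type lists are hypothetical.

```python
from collections import Counter


def load_semgroups(lines):
    """Map each Semantic Type ID to its Semantic Group abbreviation.

    Assumes 'ABBR|Group Name|TUI|Semantic Type Name' on every line.
    """
    tui_to_group = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_abbr, _group_name, tui, _type_name = line.split("|")
        tui_to_group[tui] = group_abbr
    return tui_to_group


def group_histogram(heading_tuis, tui_to_group):
    """Count headings per Semantic Group, once per group per heading."""
    hist = Counter()
    for tuis in heading_tuis:
        hist.update({tui_to_group[t] for t in tuis if t in tui_to_group})
    return hist


# A few sample lines in the assumed layout.
sample_lines = [
    "ANAT|Anatomy|T017|Anatomical Structure",
    "CHEM|Chemicals & Drugs|T109|Organic Chemical",
    "CHEM|Chemicals & Drugs|T121|Pharmacologic Substance",
    "DISO|Disorders|T047|Disease or Syndrome",
]
mapping = load_semgroups(sample_lines)

# Hypothetical Semantic Type lists, one list per assigned MeSH Heading.
heading_tuis = [["T109", "T121"], ["T047"], ["T017"], ["T121"]]
print(group_histogram(heading_tuis, mapping))
```

A heading with two Semantic Types in the same group (the first entry above) contributes one count to that group, which is one plausible reading of a group-level frequency count; the README for the histST file is the authority on how the published counts were produced.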

One or more of the following tools may be needed to access the files located on this page after they have been downloaded. The need depends on your current computer resources.

Adobe Reader -- Adobe's free PDF reader
gunzip -- to uncompress files
WinZip -- to uncompress files

No files are listed for baseline years 2009-2016.

DTD (Document Type Definition) Files
Item                                  2002       2003       2004       2005       2006       2007       2008
DTDs (gzipped tar)                    3.7 KB     3.8 KB     4.4 KB     4.1 KB     6.1 KB     4.3 KB     129 KB

Frequency Count Files
Item                                  2002       2003       2004       2005       2006       2007       2008
README                                2 KB       2 KB       2 KB       2 KB       2 KB       2 KB       2 KB
Summary                               228 bytes  228 bytes  228 bytes  228 bytes  228 bytes  228 bytes  228 bytes
Chemical_freq_alpha.gz                1.5 MB     1.5 MB     1.6 MB     1.7 MB     1.85 MB    2.0 MB     2.1 MB
Chemical_freq_count.gz                1.5 MB     1.6 MB     1.7 MB     1.8 MB     1.92 MB    2.1 MB     2.2 MB
MH_freq_alpha.gz                      207 KB     215 KB     226 KB     232 KB     245 KB     256 KB     385 KB
MH_freq_count.gz                      219 KB     227 KB     239 KB     246 KB     260 KB     270 KB     373 KB
MH_major_freq_count.gz                218 KB     226 KB     239 KB     246 KB     260 KB     271 KB     378 KB
MH_SH_freq_alpha.gz                   2 MB       2.1 MB     2.2 MB     2.3 MB     2.37 MB    2.45 MB    2.52 MB
MH_SH_freq_count.gz                   3 MB       3.1 MB     3.3 MB     3.4 MB     3.5 MB     3.67 MB    3.78 MB
MH_SH_major_freq_count.gz             3 MB       3.2 MB     3.29 MB    3.41 MB    3.54 MB    3.68 MB    3.79 MB
SH_freq_alpha.gz                      1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB
SH_freq_count.gz                      1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB
SH_major_freq_count.gz                1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB     1.3 KB

Raw Data Files
Item                                  2002       2003       2004       2005       2006       2007       2008
README                                3.1 KB     3.1 KB     3.1 KB     3.1 KB     3.4 KB     3.4 KB     3.4 KB
Summary                               136 bytes  136 bytes  136 bytes  136 bytes  136 bytes  162 bytes  163 bytes
Chemical_items.gz                     162 MB     180 MB     195 MB     207 MB     229 MB     243 MB     260 MB
ID_items.gz                           42 MB      45 MB      48 MB      52 MB      57 MB      61 MB      68 MB
Full_MH_SH_items.gz                   995 MB     1 GB       1.1 GB     1.2 GB     1.3 GB     1.3 GB     1.4 GB
MH_items.gz                           1.5 GB     1.7 GB     1.7 GB     1.9 GB     2.1 GB     2.1 GB     1 GB [A]
MH_items (Part II starts with 2008)   N/A        N/A        N/A        N/A        N/A        N/A        1.1 GB [B]
MH_SH_items.gz                        511 MB     544 MB     580 MB     610 MB     660 MB     702 MB     749 MB
SH_items.u.gz                         161 MB     175 MB     188 MB     188 MB     214 MB     233 MB     260 MB
[A] The 2008 file is named MH_itemsA.gz.
[B] The 2008 file is named MH_itemsB.gz.

Histogram/Summary Files
Item                                  2002       2003       2004       2005       2006       2007       2008
README                                2.9 KB     2.9 KB     2.9 KB     2.9 KB     2.9 KB     2.9 KB     2.9 KB
hist                                  4.8 KB     4.8 KB     4.9 KB     4.8 KB     4.5 KB     4.4 KB     4.4 KB
Graph of hist File (PDF)              5.1 KB     5.1 KB     5.0 KB     5.1 KB     5.4 KB     7.4 KB     6.1 KB
Graph of Combined hist Files (PDF)    N/A        N/A        5.8 KB     6.4 KB     8 KB       44 KB      10 KB
histST                                309 bytes  309 bytes  309 bytes  311 bytes  311 bytes  311 bytes  312 bytes
Graph of histST File (PDF)            3.8 KB     3.8 KB     3.8 KB     3.8 KB     3.7 KB     4.9 KB     3.7 KB
Graph of Combined histST Files (PDF)  N/A        N/A        4.7 KB     5.2 KB     6.2 KB     38 KB      7.8 KB
hist_Full                             N/A        N/A        33.2 KB    34.2 KB    33.7 KB    33.9 KB    34.6 KB
Graph of hist_Full File (PDF)         N/A        N/A        24.1 KB    24.7 KB    27.1 KB    218 KB     27.9 KB
histST_Full                           N/A        N/A        4.2 KB     4.3 KB     4.4 KB     4.5 KB     4.6 KB
Graph of histST_Full File (PDF)       N/A        N/A        24.1 KB    24.8 KB    24.8 KB    210 KB     27.3 KB

Related MeSH Files
Access to All MeSH Files: Memorandum of Understanding
Item                                  2002       2003       2004       2005       2006       2007       2008
README                                1.3 KB     1 KB       1 KB       1 KB       1.1 KB     1.1 KB     1.1 KB
streeYYYY.bin                         642 KB     678 KB     699 KB     715 KB     767 KB     787 KB     804 KB

UMLS Semantic Groups File
Item                                  2004       2005       2006
SemGroups.txt                         5.8 KB     5.8 KB     5.8 KB


Last Modified: February 22, 2008