MetaMap 2009
(30 Jul 2009)

MetaMap 2008 v2
(25 Mar 2009)

MetaMap 2008
(24 Sep 2008)

MetaMap 2007
(24 Sep 2008)


About MetaMap

MetaMap is a highly configurable program developed by Dr. Alan (Lan) Aronson at the National Library of Medicine (NLM) to map biomedical text to the UMLS Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge-intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. Besides being applied in both information retrieval (IR) and data mining applications, MetaMap is one of the foundations of NLM's Medical Text Indexer (MTI), which is being applied to both semiautomatic and fully automatic indexing of biomedical literature at NLM. For more information on MetaMap and related research, see the SKR Research Information Site (http://skr.nlm.nih.gov/papers/index.shtml).

What's New in MetaMap?

July 30, 2009 - MetaMap 2009 Released

New in this Release

  1. NegEx enhancements,
  2. More options for XML generation,
  3. A new sentence-breaking algorithm,
  4. A new input format,
  5. Elimination of Moderate Model, and
  6. Elimination of display_original_phrases command-line option.
Other less visible changes, which will be mentioned but not described further, are various bug fixes involving the exclude_sources option, the display_original_phrase option, the term_processing option, positional information, and Fielded MMI Output. We also added several new Acronym/Abbreviation-detection rules. Finally, we upgraded MetaMap from Berkeley DB 3.0.55 to 4.1.24; this last change is completely transparent to users, but it will require keeping multiple versions of the database files if you want to run both MetaMap09 and any previous release on the same filesystem. For more information on the changes in MetaMap since the previous release, see the MetaMap 2009 Release Notes.

Caveats

Previous MetaMap databases are not compatible with this version of MetaMap. This version of MetaMap uses data indexes that have additional information within them. If you have used one of the Optional Data Models in the past, you will need to download the new versions from the MetaMap website.

Important Notes

Avenues to MetaMap:

Web Access: Our Semantic Knowledge Representation (SKR) website provides both Interactive and Batch facilities that allow users to send text to our internal machines and run various programs, including the MetaMap program. The Interactive facility is designed for testing options and running small amounts of text. The Batch facility runs large amounts of text through our Scheduler program, which distributes the workload over a large pool of clients. GO TO SKR

MetaMap: Distributable version of the original Prolog MetaMap program. Currently includes only binary distributions for the Solaris and Linux platforms. GO TO MetaMap

SKR API: A Java-based API to the SKR Scheduler facility, created to give users the ability to programmatically submit jobs to the Scheduler Interactive and Batch facilities instead of using the web-based interfaces. We have tried to reproduce full functionality for all of the programs under the SKR Scheduler umbrella. The SKR API has been tested on the Solaris, Linux, and Windows XP platforms. GO TO SKR API

NOTE: MMTx is no longer supported except for major bug fixes. We recommend that all users switch to the downloadable MetaMap (described above) if possible.

MMTx: MetaMap Transfer (MMTx) is a Java-based distributable version of the MetaMap program. It includes binary and source distributions and is supported on Solaris, Linux, Windows, and Mac platforms. MMTx was an early attempt at providing a distributable version of MetaMap and is currently being phased out in favor of the original Prolog version of MetaMap. There are two reasons for the phase-out of MMTx: 1) The original Prolog version of MetaMap is much faster, especially now with the new speed enhancements (V2). 2) We were never able to make MMTx and MetaMap produce the same results; there was always about a 20% difference in the overall results MMTx would produce. GO TO MMTx

New to MetaMap, or confused as to what you should use?

If you are new to MetaMap, or are unsure where to start, we recommend starting with our research papers on MetaMap at the MetaMap section of the SKR Research Information Site (URL: http://skr.nlm.nih.gov/papers/index.shtml#MetaMap). We also recommend using our Interactive web interface first to get a feel for how MetaMap works and how the various options affect the results.

Once you have a good feel for what you would like to do, the decision comes down to how much control you need over your data. If you need to run everything locally, you will need to download and install either our MetaMap program or our MMTx program, depending on what is available for your platform of choice. If you do not need to maintain control of your data, we offer a Batch facility for processing large sets of data through our pool of clients. You can access our Batch facility through our SKR web site or through our Java-based SKR API.

In either case, your data is uploaded to our web site, processed by our Scheduler program, and then the results are provided to you. We maintain your data and results for a maximum of 15 days, after which they are purged from our systems. Only you and our team have access to the data and results. We review Batch jobs only when there is a specific request or when we see a Batch job causing problems with the Scheduler.


MetaMap

Prerequisites:

Downloads:

PLEASE NOTE: The downloads are restricted and require a valid UMLSKS username and password! Please see the above list of prerequisites before attempting to download MetaMap.

NOTE: If you have already installed the 2008 MetaMap, you will just need to use the Binary Update download and Binary Update Installation instructions. You do not need to do a complete re-install.

Currently, each of the full downloads contains a binary version of MetaMap compiled specifically for either Linux or Solaris, together with the Strict Data Model for the corresponding year. We also plan to include the Relaxed Data Model for each year, and we are working on updating the DataFileBuilder to work with MetaMap. These features will be phased in as they are completed and tested.

Full Downloads:

Binary Update Downloads:

Installation:

Move the downloaded file into the directory where you want to install MetaMap. This directory is referred to as <parent_directory> throughout the rest of the installation instructions (it also appears as <parent dir> or <parent directory> in some of the examples and sample output).

To extract the MetaMap distribution, use the following bunzip2 and tar commands substituting the appropriate name of the file you downloaded (e.g., public_mm_linux_2008.tar.bz2, public_mm_solaris_2008.tar.bz2, public_mm_linux_2007.tar.bz2, or public_mm_solaris_2007.tar.bz2):

% bunzip2 -c public_mm_<platform>_<year>.tar.bz2 | tar xvf -
This command creates the distribution directory public_mm in the current working directory (<parent_directory>), giving you <parent_directory>/public_mm.
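
The file-naming pattern and the extraction command above can be wrapped in a small script. The following is only a sketch: the helper names mm_archive_name and mm_extract are hypothetical and are not part of the MetaMap distribution, and the script assumes the archive sits in the current directory and follows the public_mm_<platform>_<year>.tar.bz2 pattern shown above.

```shell
# Hypothetical helpers, not part of the MetaMap distribution.

# Build the archive file name from a platform (linux or solaris) and a year.
mm_archive_name() {
    printf 'public_mm_%s_%s.tar.bz2\n' "$1" "$2"
}

# Extract the archive into the current directory.
# Fails early with a message if the archive is not present.
mm_extract() {
    archive=$(mm_archive_name "$1" "$2")
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Same pipeline as in the instructions above.
    bunzip2 -c "$archive" | tar xvf -
}
```

For example, running mm_extract linux 2008 from <parent_directory> would be equivalent to the bunzip2 and tar command above with the Linux 2008 archive.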

To begin the initial install, go to the directory created when you extracted the distribution (public_mm).

% cd public_mm
You can speed up the process by telling the install program where your Java installation is: set the environment variable JAVA_HOME to the Java installation directory. If you don't set the variable, the program will prompt you for the information.

To find out where your Java installation is located, use the following command:

% which java
To set the environment variable JAVA_HOME, use the information from the which command. For example, if which java returns /usr/local/jre1.4.2/bin/java, then JAVA_HOME should be set to /usr/local/jre1.4.2.

# in C Shell (csh or tcsh)

setenv JAVA_HOME /usr/local/jre1.4.2

# in Bourne Again Shell (bash)

export JAVA_HOME=/usr/local/jre1.4.2

# Bourne Shell (sh)

JAVA_HOME=/usr/local/jre1.4.2
export JAVA_HOME
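
The rule described above (drop the trailing /bin/java from the path reported by which) can be expressed as a one-line shell function. This is a minimal sketch; java_home_from_path is a hypothetical name, and the approach assumes that which java reports the real installation path rather than a symbolic link such as /usr/bin/java, in which case the result would need to be checked by hand.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of the
# java executable by stripping the trailing "/bin/java" component.
# If the path does not end in /bin/java, it is printed unchanged.
java_home_from_path() {
    java_bin=$1
    printf '%s\n' "${java_bin%/bin/java}"
}
```

In a Bourne-compatible shell it could be used as JAVA_HOME=$(java_home_from_path "$(which java)") followed by export JAVA_HOME.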
You also need to add the <parent dir>/public_mm/bin directory to your program path:

# in C Shell (csh or tcsh)

setenv PATH <parent dir>/public_mm/bin:$PATH

# in Bourne Again Shell (bash)

export PATH=<parent dir>/public_mm/bin:$PATH

# Bourne Shell (sh)

PATH=<parent dir>/public_mm/bin:$PATH
export PATH
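
If these lines are placed in a shell startup file, the same directory can end up on the path several times. The sketch below shows one way to guard against that in a Bourne-compatible shell; prepend_path is a hypothetical helper name, and /opt/public_mm/bin in the usage note is only a placeholder for <parent dir>/public_mm/bin.

```shell
# Hypothetical helper: print a path list with a directory prepended,
# unless the directory is already one of its entries.
#   $1 = directory to add, $2 = current colon-separated path list
prepend_path() {
    dir=$1
    current=$2
    if [ -z "$current" ]; then
        printf '%s\n' "$dir"
        return 0
    fi
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;      # already present
        *)          printf '%s\n' "$dir:$current" ;; # add at the front
    esac
}
```

It could then be used as PATH=$(prepend_path /opt/public_mm/bin "$PATH") followed by export PATH.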
Now you are ready to run the installation script:
% ./bin/install.sh
A successful installation should look similar to the following:
% cd <parent dir>/public_mm
% ./bin/install.sh
Enter basedir of installation [<parent dir>/public_mm] <user hits return to get the default>
Basedir is set to <parent dir>/public_mm.

The WSD Server requires Sun's Java Runtime Environment (JRE)
Sun's Java Developer Kit (JDK) will work as well. if the
command: "which" java returns /usr/local/jre1.4.2/bin/java, then the
JRE resides in /usr/local/jre1.4.2/.

Where does your distribution of Sun's JRE reside?
Enter home path of JRE (JDK) [/usr]: /nfsvol/nls/tools/Linux-i686/java1.4.2
Using /nfsvol/nls/tools/Linux-i686/java1.4.2 for JAVA_HOME.

<parent dir>/public_mm/WSD_Server/config/disambServer.cfg generated
<parent dir>/public_mm/WSD_Server/config/log4j.properties generated
<parent dir>/public_mm/bin/SKRrun generated.
<parent dir>/public_mm/bin/metamap07 generated.
<parent dir>/public_mm/bin/wsdserverctl generated.
<parent dir>/public_mm/bin/skrmedpostctl generated.
Install complete.

%

MetaMap requires the starting of one or two servers depending on how you plan to use MetaMap. The SKR/MedPost Part-of-Speech Tagger Server is required regardless of how you use MetaMap. The Word Sense Disambiguation (WSD) Server is optional and only needs to be started if you want/plan to use the WSD option (-y) with MetaMap. They can be started and stopped as follows. Both servers will automatically run in the background when started.

Starting the SKR/MedPost Part-of-Speech Tagger Server:

% ./bin/skrmedpostctl start
Starting the Word Sense Disambiguation (WSD) Server (optional):
% ./bin/wsdserverctl start
You can stop each server by invoking the corresponding script with the stop parameter:

Stopping the SKR/MedPost Part-of-Speech Tagger Server:
% ./bin/skrmedpostctl stop
Stopping the Word Sense Disambiguation (WSD) Server:
% ./bin/wsdserverctl stop
You can determine whether the servers are running with the following command:
% ps -ef | grep java
The output should look something like this:
11318 pts/4 S+ 0:00 grep java
21254 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -cp ... /MedPost-SKR/Tagger_server/lib/mps.jar taggerServer
21267 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -Xmx2g ... WSD_Server/lib/log4j-1.2.8.jar wsd.server.DisambiguatorServer
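
Reading the process list by eye can be replaced by a small check that looks for the two Java class names visible in the sample output above (taggerServer and wsd.server.DisambiguatorServer). This is a sketch under the assumption that those names appear on the command line of the running servers, as they do in the sample; server_status is a hypothetical function name.

```shell
# Hypothetical helper: read a process listing on standard input and report
# whether each MetaMap server appears in it. The class names are taken
# from the sample ps output above.
server_status() {
    ps_output=$(cat)
    for name in taggerServer wsd.server.DisambiguatorServer; do
        if printf '%s\n' "$ps_output" | grep -F -q -- "$name"; then
            echo "$name: running"
        else
            echo "$name: not running"
        fi
    done
}
```

It could be used as ps -ef | grep java | server_status. A "not running" result for the WSD Server is expected if you chose not to start it.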
Once the servers have been started and verified, you can test your MetaMap installation by using the following command:
% echo "lung cancer" | ./bin/metamap07 -I
OR
% echo "lung cancer" | ./bin/metamap08 -I
You should see a result similar to the following:
MetaMap (2007)

Control options:
  tag_text
  no_acros_abbrs
  an_derivational_variants
  stop_large_n
  plain_syntax
  candidates
  semantic_types
  mappings
  best_mappings_only
  show_cuis
Initializing db_access (07)...
Berkeley DB databases (normal strict model) are open.
Static variants will come from table varsan.
Accessing lexicon <parent directory>/public_mm/lexicon/data/lexiconStatic2007.
Variant generation mode: static.
Initializing tagger on localhost...

Processing 00000000.tx.1: lung cancer

Phrase: "lung cancer"
Meta Candidates (8):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
  861 C0006826:Cancer (Malignant Neoplasms) [Neoplastic Process]
  861 C0024109:Lung [Body Part, Organ, or Organ Component]
  861 C0998265:Cancer (Cancer Genus) [Invertebrate]
  861 C1278908:Lung (Entire lung) [Body Part, Organ, or Organ Component]
  861 C1306459:Cancer (Primary malignant neoplasm) [Neoplastic Process]
  768 C0032285:Pneumonia [Disease or Syndrome]
Meta Mapping (1000):
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
Meta Mapping (1000):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process] 
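
For a quick look at the scores and concept identifiers (CUIs) in output like the sample above, the candidate and mapping lines can be filtered with awk. This is only a sketch: mm_scores_and_cuis is a hypothetical name, and it assumes the human-readable layout shown above, produced with the -I option, where each candidate line starts with a score followed by CUI:name. It is not a substitute for MetaMap's machine-oriented output formats.

```shell
# Hypothetical helper: read human-readable MetaMap output (as produced
# with -I) on standard input and print "score CUI" for every candidate
# or mapping line. Lines that do not have a CUI in the second field
# (headers, control options, and so on) are skipped.
mm_scores_and_cuis() {
    awk '$2 ~ /^C[0-9]+:/ {
        split($2, parts, ":")   # parts[1] is the CUI, e.g. C0242379
        print $1, parts[1]
    }'
}
```

It could be used as echo "lung cancer" | ./bin/metamap07 -I | mm_scores_and_cuis. Concepts that appear both as candidates and in a mapping are printed once for each occurrence.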
If there are no errors starting the WSD and Tagger servers, and you had a successful test, then MetaMap can be run as follows:
% ./bin/metamap07
OR
% ./bin/metamap08
MetaMap has a plethora of options that are explained elsewhere (MetaMap 2008 Usage, or MetaMap 2007 Usage).

Binary Update Installation:

You can use the following command to confirm that the update is needed and, later, that it was applied. Before the update it should fail with an error; after the update it should run normally.

% echo "lung cancer" | ./bin/metamap08 -e MSH
Before the update, this should produce an error similar to the following:
#### ERROR: skr_phrase failed on 00000000.tx.1 lung cancer
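
The before-and-after check can be automated by looking for the "#### ERROR" prefix shown in the message above. This is a minimal sketch; update_status is a hypothetical name, and it assumes that MetaMap error lines begin with that prefix, as in the sample.

```shell
# Hypothetical helper: read MetaMap output on standard input and print
# "error" if any line starts with the "#### ERROR" prefix, otherwise "ok".
update_status() {
    if grep -q '^#### ERROR'; then
        echo "error"
    else
        echo "ok"
    fi
}
```

It could be used as echo "lung cancer" | ./bin/metamap08 -e MSH 2>&1 | update_status. The 2>&1 is included because this sketch does not assume which output stream the error message is written to.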
Move the downloaded file into the <parent_directory> where you already installed MetaMap - you should be able to see the public_mm directory.

Before you extract the binary distribution files, you should back up the affected files.

% cd public_mm/bin
% mkdir Backup
% cp metamap08.BINARY install.sh SKRrun.in Backup
% cd ../..
To extract the MetaMap binary distribution, use the following bunzip2 and tar commands substituting the appropriate name of the file you downloaded (e.g., public_mm_linux_binaries_2008v2.tar.bz2 or public_mm_solaris_binaries_2008v2.tar.bz2):
% bunzip2 -c public_mm_<platform>_binaries_<year>v2.tar.bz2 | tar xvf -
Now run the installation script to update the various files used by MetaMap. Answer the questions just like before.
% ./bin/install.sh
You can test your updated MetaMap installation by using the following command:
% echo "lung cancer" | ./bin/metamap08 -e MSH
You should see a result similar to the following:
MetaMap (2008)


Control options:
  mm_data_year=08
  exclude_sources=[MSH]
Berkeley DB databases (normal strict 08 model) are open.
Static variants will come from table varsan.
Variants: Adj/noun ONLY.
Accessing lexicon <parent directory>/public_mm/lexicon/data/lexiconStatic2008.
Variant generation mode: static.

Established connection to Tagger Server on localhost.
Processing 00000000.tx.1: lung cancer

Phrase: "lung cancer"
Meta Candidates (8):
  1000 Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
  1000 Lung Cancer (Carcinoma of lung) [Neoplastic Process]
   861 Cancer (Malignant Neoplasms) [Neoplastic Process]
   861 Lung [Body Part, Organ, or Organ Component]
   861 Cancer (Cancer Genus) [Invertebrate]
   861 Lung (Entire lung) [Body Part, Organ, or Organ Component]
   861 Cancer (Primary malignant neoplasm) [Neoplastic Process]
   768 Pneumonia [Disease or Syndrome]
Meta Mapping (1000):
  1000 Lung Cancer (Carcinoma of lung) [Neoplastic Process]
Meta Mapping (1000):
  1000 Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]

Un-Install:

Before un-installing MetaMap, make sure both of the MetaMap servers have been stopped (see Stopping the servers).

To un-install MetaMap, move to the parent directory of your MetaMap installation and run the uninstall program:
% cd <parent directory of installation>
% ./public_mm/bin/uninstall.sh
Do you really want to uninstall MetaMap? [no/yes] yes
Removing Tagger Server
Removing WSD Server
Removing Lexicon
Removing MetaMap Databases
Removing Programs
Removing Base Directory
Removal of MetaMap installation successful.
%

Using MetaMap:

For more information on running MetaMap and its many options, please see these references:

  1. MetaMap 2009 Release Notes (HTML)
  2. MetaMap 2009 Readme (HTML)
  3. MetaMap 2009 Usage (HTML)
  4. MetaMap 2008 Readme (HTML)
  5. MetaMap 2007 Readme (HTML)
  6. MetaMap 2008 Usage (HTML)
  7. MetaMap 2007 Usage (HTML)
  8. Effective Mapping of Biomedical Text to the UMLS Metathesaurus: The MetaMap Program, 2001  (PDF - 55 kb)
  9. MetaMap: Mapping Text to the UMLS Metathesaurus, July 2006  (PDF - 280 kb)
  10. MetaMap Options and Examples, September 2006  (PDF - 50 kb)