MetaMap is a highly configurable program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge-intensive approach based on symbolic natural language processing (NLP) and computational linguistic techniques. In addition to its use in information retrieval (IR) and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System, which is being applied to both semiautomatic and fully automatic indexing of the biomedical literature at the library.

Prerequisites:
  • MetaMap requires a minimum of 7GB of disk space when it has been uncompressed.

  • MetaMap requires a minimum of 1GB of memory to run. At least 2GB is recommended.

  • You will need a working version of bunzip2 to uncompress the MetaMap download file. If you do not have a copy of bunzip2, it is available from http://www.bzip.org/.

  • To run MetaMap, you will need the Sun Java Runtime Environment (JRE). We have tested MetaMap with Sun's JRE 1.4.2, 1.5, and 1.6.0_07-b06. The JRE is required to run the SKR/MedPost Part-of-Speech (POS) Tagger Server, which MetaMap needs in order to run properly. The JRE is also used by the Word Sense Disambiguation (WSD) Server, which must be running if you intend to use the WSD option (-y) when running MetaMap. The Sun JRE is available from: http://java.sun.com

  • To use MetaMap, you must comply with the MetaMap Terms and Conditions.

  • To download MetaMap, you must have access to a UMLSKS account. For more information about how we use UMLSKS authentication data, or for information on how to acquire a UMLSKS account, please see the UMLSKS Account Information Page.

  • To use MetaMap, you must have signed the UMLS agreement. The UMLS agreement requires those who use the UMLS to file a brief report once a year summarizing their use of the UMLS. It also requires the acknowledgment that the UMLS contains copyrighted material and that those copyright restrictions be respected. The UMLS agreement requires users to obtain agreements for EACH copyrighted source prior to its use within a commercial or production application. Use of all the sources is permitted if the application is used for research purposes only.
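
The tool and disk-space prerequisites above can be checked before downloading. The following is a minimal sketch and not part of the MetaMap distribution: the helper names check_tool and enough_space are hypothetical, and the 7GB threshold is simply the figure from the list above. Memory is not checked, because the command for that differs between Linux and Solaris.

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
# (check_tool is a hypothetical helper, not a MetaMap script.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Succeed if the given number of free kilobytes covers the 7GB
# needed by the uncompressed distribution (7 * 1024 * 1024 KB).
enough_space() {
    [ "$1" -ge 7340032 ]
}

check_tool bunzip2
check_tool java

# df -Pk prints POSIX-format output in 1024-byte units; the fourth
# column of the second line is the space available in the current directory.
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')
if enough_space "$free_kb"; then
    echo "disk space: ok"
else
    echo "disk space: less than 7GB free"
fi
```

Run it from the directory where you intend to install MetaMap, since the disk-space check applies to the current directory.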
Downloads:

PLEASE NOTE: The downloads are restricted and require a valid UMLSKS username and password! Please see the above list of prerequisites before attempting to download MetaMap.

Currently, each download contains a binary version of MetaMap compiled specifically for either Linux or Solaris, together with the Strict Data Model for the corresponding year. We also plan to include the Moderate and Relaxed Data Models for each year, and we are working on updating the DataFileBuilder to work with MetaMap. These features will be phased in as they are completed and tested.
Installation:

Move the downloaded file into a directory where you want to install MetaMap. This directory will then be referred to as <parent_directory> throughout the rest of the installation instructions.

To extract the MetaMap distribution, use the following bunzip2 and tar commands substituting the appropriate name of the file you downloaded (e.g., public_mm_linux_2008.tar.bz2, public_mm_solaris_2008.tar.bz2, public_mm_linux_2007.tar.bz2, or public_mm_solaris_2007.tar.bz2):
% bunzip2 -c public_mm_<platform>_<year>.tar.bz2 | tar xvf -
These commands create the distribution directory public_mm in the current working directory (<parent_directory>), giving you <parent_directory>/public_mm.
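
The file names listed above follow a single pattern, so the archive name for a given machine can be derived from the operating system name. This is only a sketch: mm_archive_name is a hypothetical helper, and it assumes that uname -s prints Linux or SunOS (the name under which Solaris identifies itself).

```shell
#!/bin/sh
# Build the download file name from an operating system name
# (as printed by `uname -s`) and a release year.
mm_archive_name() {
    case "$1" in
        Linux) platform=linux ;;
        SunOS) platform=solaris ;;   # Solaris identifies itself as SunOS
        *) echo "unsupported platform: $1" >&2; return 1 ;;
    esac
    echo "public_mm_${platform}_$2.tar.bz2"
}

mm_archive_name Linux 2008    # prints public_mm_linux_2008.tar.bz2
mm_archive_name SunOS 2007    # prints public_mm_solaris_2007.tar.bz2

# On the target machine, the extraction step could then be written as:
#   bunzip2 -c "$(mm_archive_name "$(uname -s)" 2008)" | tar xvf -
```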

To begin the initial install, go to the directory created when you extracted the distribution (public_mm).
% cd public_mm
You can speed up the process by telling the install program where your Java installation is: set the environment variable JAVA_HOME to the Java installation directory. If you don't set the variable, the program will prompt you for the information.

To find out where your Java installation is located, use the following command:
% which java
To set the environment variable JAVA_HOME, use the information from the which command. For example, if the command: which java returns /usr/local/jre1.4.2/bin/java, then JAVA_HOME should be set to /usr/local/jre1.4.2/.
# in C Shell (csh or tcsh)
setenv JAVA_HOME /usr/local/jre1.4.2

# in Bourne Again Shell (bash)
export JAVA_HOME=/usr/local/jre1.4.2

# Bourne Shell (sh)
JAVA_HOME=/usr/local/jre1.4.2
export JAVA_HOME
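
For sh and bash users, the step from the which output to JAVA_HOME can also be scripted. The sketch below assumes the java executable sits in <JAVA_HOME>/bin/java, as in the example above; java_home_from is a hypothetical helper name. If the java found on your PATH is a symbolic link (for example /usr/bin/java), the result will be the link's location rather than the real installation directory, so check the value before relying on it.

```shell
#!/bin/sh
# Strip the trailing /bin/java from the path of a java executable.
java_home_from() {
    dirname "$(dirname "$1")"
}

java_home_from /usr/local/jre1.4.2/bin/java    # prints /usr/local/jre1.4.2

# Apply it to whatever java is first on the PATH, if there is one.
if java_bin=$(command -v java); then
    JAVA_HOME=$(java_home_from "$java_bin")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```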
You also need to add the <parent_directory>/public_mm/bin directory to your program path:
# in C Shell (csh or tcsh)
setenv PATH <parent_directory>/public_mm/bin:$PATH

# in Bourne Again Shell (bash)
export PATH=<parent_directory>/public_mm/bin:$PATH

# in Bourne Shell (sh)
PATH=<parent_directory>/public_mm/bin:$PATH
export PATH
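
If the PATH line is placed in a shell startup file, it is re-run for every new shell and the same directory can end up listed several times. A small guard avoids that. This sketch is for sh and bash only; add_to_path and MM_PARENT are hypothetical names, and /usr/local is just a placeholder for your own parent directory.

```shell
#!/bin/sh
# Prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# MM_PARENT is a hypothetical variable standing in for the parent directory.
MM_PARENT=${MM_PARENT:-/usr/local}
add_to_path "$MM_PARENT/public_mm/bin"
add_to_path "$MM_PARENT/public_mm/bin"   # second call leaves PATH unchanged
echo "$PATH"
```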
Now you are ready to run the installation script:
% ./bin/install.sh
A successful installation should look similar to the following:
% cd <parent dir>/public_mm
% ./bin/install.sh
Enter basedir of installation [<parent dir>/public_mm] <user hits return to get the default>
Basedir is set to <parent dir>/public_mm.

The WSD Server requires Sun's Java Runtime Environment (JRE)
Sun's Java Developer Kit (JDK) will work as well. if the
command: "which" java returns /usr/local/jre1.4.2/bin/java, then the
JRE resides in /usr/local/jre1.4.2/.

Where does your distribution of Sun's JRE reside?
Enter home path of JRE (JDK) [/usr]: /nfsvol/nls/tools/Linux-i686/java1.4.2
Using /nfsvol/nls/tools/Linux-i686/java1.4.2 for JAVA_HOME.

<parent dir>/public_mm/WSD_Server/config/disambServer.cfg generated
<parent dir>/public_mm/WSD_Server/config/log4j.properties generated
<parent dir>/public_mm/bin/SKRrun generated.
<parent dir>/public_mm/bin/metamap07 generated.
<parent dir>/public_mm/bin/wsdserverctl generated.
<parent dir>/public_mm/bin/skrmedpostctl generated.
Install complete.

%
MetaMap requires one or two servers to be running, depending on how you plan to use it. The SKR/MedPost Part-of-Speech Tagger Server is required regardless of how you use MetaMap. The Word Sense Disambiguation (WSD) Server is optional and only needs to be started if you plan to use the WSD option (-y) with MetaMap. The servers can be started and stopped as follows. Both servers automatically run in the background once started.

Starting the SKR/MedPost Part-of-Speech Tagger Server:
% ./bin/skrmedpostctl start
Starting the Word Sense Disambiguation (WSD) Server (optional):
% ./bin/wsdserverctl start
You can stop each server by invoking the corresponding script with the stop parameter:

Stopping the SKR/MedPost Part-of-Speech Tagger Server:
% ./bin/skrmedpostctl stop
Stopping the Word Sense Disambiguation (WSD) Server:
% ./bin/wsdserverctl stop
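
When both servers are in use, the two control scripts can be driven by one small wrapper. This is a sketch under assumptions: mm_servers and MM_BIN are hypothetical names, and the only thing taken from the instructions above is that skrmedpostctl and wsdserverctl accept start and stop. Because the WSD Server is optional, drop wsdserverctl from the loop if you do not use the -y option.

```shell
#!/bin/sh
# MM_BIN is a hypothetical variable naming the public_mm/bin directory.
MM_BIN=${MM_BIN:-./bin}

# Pass one action (start or stop) to both control scripts.
mm_servers() {
    for ctl in skrmedpostctl wsdserverctl; do
        if [ -x "$MM_BIN/$ctl" ]; then
            "$MM_BIN/$ctl" "$1"
        else
            echo "not found: $MM_BIN/$ctl" >&2
        fi
    done
}

# Usage, from the public_mm directory:
#   mm_servers start
#   mm_servers stop
```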
You can check whether the servers are running with the following command:
% ps ax | grep java
The output should look something like this:
11318 pts/4 S+ 0:00 grep java
21254 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -cp ... /MedPost-SKR/Tagger_server/lib/mps.jar taggerServer
21267 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -Xmx2g ... WSD_Server/lib/log4j-1.2.8.jar wsd.server.DisambiguatorServer
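
The same check can be scripted so that it reports on each server by name. The sketch below assumes the class names taggerServer and wsd.server.DisambiguatorServer appear in the process listing, as they do in the sample output above; report_servers is a hypothetical helper. It reads the listing from standard input, which keeps the search itself out of the results.

```shell
#!/bin/sh
# Read a process listing on standard input and report, for each
# server class name, whether a matching process line was found.
report_servers() {
    listing=$(cat)
    for name in taggerServer wsd.server.DisambiguatorServer; do
        if printf '%s\n' "$listing" | grep -F -q "$name"; then
            echo "$name: running"
        else
            echo "$name: not running"
        fi
    done
}

# Demonstration with one line copied from the sample output above.
report_servers <<'EOF'
21254 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -cp ... /MedPost-SKR/Tagger_server/lib/mps.jar taggerServer
EOF
# prints:
#   taggerServer: running
#   wsd.server.DisambiguatorServer: not running

# On a live system:
#   ps ax | report_servers
```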
Once the servers have been started and verified, you can test your MetaMap installation by using the following command:
% echo "lung cancer" | ./bin/metamap07 -I
OR
% echo "lung cancer" | ./bin/metamap08 -I
You should see a result similar to the following:
MetaMap (2007)


Control options:
  tag_text
  no_acros_abbrs
  an_derivational_variants
  stop_large_n
  plain_syntax
  candidates
  semantic_types
  mappings
  best_mappings_only
  show_cuis
Initializing db_access (07)...
Berkeley DB databases (normal strict model) are open.
Static variants will come from table varsan.
Accessing lexicon <parent directory>/public_mm/lexicon/data/lexiconStatic2007.
Variant generation mode: static.
Initializing tagger on localhost...



Processing 00000000.tx.1: lung cancer

Phrase: "lung cancer"
Meta Candidates (8):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
  861 C0006826:Cancer (Malignant Neoplasms) [Neoplastic Process]
  861 C0024109:Lung [Body Part, Organ, or Organ Component]
  861 C0998265:Cancer (Cancer Genus) [Invertebrate]
  861 C1278908:Lung (Entire lung) [Body Part, Organ, or Organ Component]
  861 C1306459:Cancer (Primary malignant neoplasm) [Neoplastic Process]
  768 C0032285:Pneumonia [Disease or Syndrome]
Meta Mapping (1000):
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
Meta Mapping (1000):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
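
For scripted use, the score and concept identifier (CUI) can be pulled out of output in this form. This is a sketch that assumes the line layout shown in the sample above (a numeric score followed by CUI:name); other MetaMap options may produce a different layout. extract_cuis is a hypothetical helper name.

```shell
#!/bin/sh
# Print "score CUI" for every line that starts with a numeric score
# followed by a concept identifier such as C0242379.
extract_cuis() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^C[0-9]+:/ {
        split($2, parts, ":")      # parts[1] is the CUI, the rest is the name
        print $1, parts[1]
    }'
}

# Demonstration with lines copied from the sample output above.
extract_cuis <<'EOF'
Meta Candidates (8):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
  768 C0032285:Pneumonia [Disease or Syndrome]
EOF
# prints:
#   1000 C0242379
#   768 C0032285

# With MetaMap installed and the servers running:
#   echo "lung cancer" | ./bin/metamap07 -I | extract_cuis
```

Candidates and mappings are both matched, so a concept that appears in both sections is printed more than once; pipe the result through sort -u if duplicates are unwanted.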
If there are no errors starting the WSD and Tagger servers, and you had a successful test, then MetaMap can be run as follows:
% ./bin/metamap07
OR
% ./bin/metamap08
MetaMap has a plethora of options that are explained elsewhere (MetaMap 2008 Usage, or MetaMap 2007 Usage).

Un-Install:

Before un-installing MetaMap, make sure both of the MetaMap servers have been stopped (see Stopping the servers).

To un-install MetaMap, move to the parent directory of your MetaMap installation and run the uninstall program:
% cd <parent directory of installation>
% ./public_mm/bin/uninstall.sh
Do you really want to uninstall MetaMap? [no/yes] yes
Removing Tagger Server
Removing WSD Server
Removing Lexicon
Removing MetaMap Databases
Removing Programs
Removing Base Directory
Removal of MetaMap installation successful.
%
Using MetaMap:

For more information on running MetaMap and its many options, please see these references:
  1.   MetaMap 2008 Readme (HTML)
  2.   MetaMap 2007 Readme (HTML)
  3.   MetaMap 2008 Usage (HTML)
  4.   MetaMap 2007 Usage (HTML)
  5.   Effective Mapping of Biomedical Text to the UMLS Metathesaurus: The MetaMap Program, 2001  (PDF - 55 kb)
  6.   MetaMap: Mapping Text to the UMLS Metathesaurus, July 2006  (PDF - 280 kb)
  7.   MetaMap Options and Examples, September 2006  (PDF - 50 kb)



Last Modified: November 04, 2008 ii-public