MetaMap is a highly configurable program developed at the National
Library of Medicine (NLM) to map biomedical text to the Metathesaurus
or, equivalently, to discover Metathesaurus concepts referred to in text.
MetaMap uses a knowledge-intensive approach based on symbolic natural
language processing (NLP) and computational linguistic techniques.
Besides being applied in both information retrieval (IR) and data mining
applications, MetaMap is one of the foundations of NLM's Indexing
Initiative System, which is being applied to both semiautomatic and fully
automatic indexing of the biomedical literature at the library.
Prerequisites:
PLEASE NOTE: The downloads are restricted and require a valid UMLSKS username and password. Please see the list of prerequisites above before attempting to download MetaMap. Currently, each download contains a binary version of MetaMap compiled specifically for either Linux or Solaris, along with the Strict Data Model for the corresponding year. We also plan to include the Moderate and Relaxed Data Models for each year, and we are working on updating the DataFileBuilder to work with MetaMap. These features will be phased in as they are completed and tested.
Move the downloaded file into a directory where you want to install MetaMap. This directory will then be referred to as <parent_directory> throughout the rest of the installation instructions. To extract the MetaMap distribution, use the following bunzip2 and tar commands substituting the appropriate name of the file you downloaded (e.g., public_mm_linux_2008.tar.bz2, public_mm_solaris_2008.tar.bz2, public_mm_linux_2007.tar.bz2, or public_mm_solaris_2007.tar.bz2):
% bunzip2 -c public_mm_<platform>_<year>.tar.bz2 | tar xvf -
These commands create the distribution directory public_mm in the
current working directory (<parent_directory>), so you will end up
with <parent_directory>/public_mm.
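As a minimal sketch of the naming scheme used by the extraction command above, the following Bourne shell function assembles the archive name from a platform and a year, and rejects values other than those listed in the text. The function name mm_archive_name is hypothetical and is not part of the MetaMap distribution.

```shell
# Hypothetical helper (not shipped with MetaMap): print the archive name
# for a given platform and year, or fail for values not listed above.
mm_archive_name() {
    case "$1" in
        linux|solaris) ;;
        *) echo "unknown platform: $1" >&2; return 1 ;;
    esac
    case "$2" in
        2007|2008) ;;
        *) echo "unknown year: $2" >&2; return 1 ;;
    esac
    printf 'public_mm_%s_%s.tar.bz2\n' "$1" "$2"
}

# Possible use from <parent_directory> (sh or bash):
#   archive=$(mm_archive_name linux 2008) && bunzip2 -c "$archive" | tar xvf -
```

Checking the platform and year first means a mistyped name fails with a message instead of passing a nonexistent file to bunzip2.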
To begin the initial install, go to the directory created when you extracted the distribution (public_mm):
% cd public_mm
You can speed up the process by telling the install program where your
Java installation is: set the environment variable JAVA_HOME to the Java
installation directory. If you don't set the variable, the program will
prompt you for the information.
To find out where your Java installation is located, use the following command:
% which java
To set the environment variable JAVA_HOME, use the output of the
which command. For example, if which java returns
/usr/local/jre1.4.2/bin/java, then JAVA_HOME should be set to
/usr/local/jre1.4.2.
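The derivation described above amounts to removing the trailing /bin/java from the path of the java executable. The sketch below shows one way to do that in the Bourne shell. The function name java_home_from is hypothetical. It does not resolve symbolic links, so if which java reports a link such as /usr/bin/java, the result may not be the real installation directory.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of the
# java executable by dropping the last two path components (/bin/java).
java_home_from() {
    dirname "$(dirname "$1")"
}

# Example with the path used in the text:
java_home_from /usr/local/jre1.4.2/bin/java   # prints /usr/local/jre1.4.2

# Possible use in bash:
#   export JAVA_HOME=$(java_home_from "$(which java)")
```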
# in C Shell (csh or tcsh)
setenv JAVA_HOME /usr/local/jre1.4.2
# in Bourne Again Shell (bash)
export JAVA_HOME=/usr/local/jre1.4.2
# Bourne Shell (sh)
JAVA_HOME=/usr/local/jre1.4.2
export JAVA_HOME
You also need to add the <parent dir>/public_mm/bin directory to
your program path:
# in C Shell (csh or tcsh)
setenv PATH <parent dir>/public_mm/bin:$PATH
# in Bourne Again Shell (bash)
export PATH=<parent dir>/public_mm/bin:$PATH
# Bourne Shell (sh)
PATH=<parent dir>/public_mm/bin:$PATH
export PATH
Now you are ready to run the installation script:
% ./bin/install.sh
A successful installation should look similar to the following:
% cd <parent dir>/public_mm
% ./bin/install.sh
Enter basedir of installation [<parent dir>/public_mm] <user hits return to get the default>
Basedir is set to <parent dir>/public_mm.
The WSD Server requires Sun's Java Runtime Environment (JRE)
Sun's Java Developer Kit (JDK) will work as well.
if the command: "which" java returns /usr/local/jre1.4.2/bin/java, then the
JRE resides in /usr/local/jre1.4.2/.
Where does your distribution of Sun's JRE reside?
Enter home path of JRE (JDK) [/usr]: /nfsvol/nls/tools/Linux-i686/java1.4.2
Using /nfsvol/nls/tools/Linux-i686/java1.4.2 for JAVA_HOME.
<parent dir>/public_mm/WSD_Server/config/disambServer.cfg generated
<parent dir>/public_mm/WSD_Server/config/log4j.properties generated
<parent dir>/public_mm/bin/SKRrun generated.
<parent dir>/public_mm/bin/metamap07 generated.
<parent dir>/public_mm/bin/wsdserverctl generated.
<parent dir>/public_mm/bin/skrmedpostctl generated.
Install complete.
%
MetaMap requires the starting of one or two servers depending on how you
plan to use MetaMap. The SKR/MedPost Part-of-Speech Tagger Server is
required regardless of how you use MetaMap. The Word Sense
Disambiguation (WSD) Server is optional and only needs to be started if
you want/plan to use the WSD option (-y) with MetaMap. They can be
started and stopped as follows. Both servers will automatically run in
the background when started.
Starting the SKR/Medpost Part-of-Speech Tagger Server:
% ./bin/skrmedpostctl start
Starting the Word Sense Disambiguation (WSD) Server (optional):
% ./bin/wsdserverctl start
You can stop each server by invoking the corresponding script
with the stop parameter:
Stopping the SKR/Medpost Part-of-Speech Tagger Server:
% ./bin/skrmedpostctl stop
Stopping the Word Sense Disambiguation (WSD) Server:
% ./bin/wsdserverctl stop
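The start and stop commands above follow one rule: the tagger server is always involved, and the WSD server only when the WSD option is going to be used. As a rough sketch of that rule, the Bourne shell function below prints the matching control-script commands without running them. The function name mm_server_commands and its "wsd" argument are hypothetical. Only the two script names come from the distribution.

```shell
# Hypothetical dry-run helper: print the control-script commands for an
# action ("start" or "stop"). The tagger server is always included; the
# WSD server is included only when the second argument is "wsd".
mm_server_commands() {
    action=$1
    case "$action" in
        start|stop) ;;
        *) echo "usage: mm_server_commands start|stop [wsd]" >&2; return 1 ;;
    esac
    echo "./bin/skrmedpostctl $action"
    if [ "${2:-}" = "wsd" ]; then
        echo "./bin/wsdserverctl $action"
    fi
}

# Print the commands needed before running MetaMap with -y:
mm_server_commands start wsd
```

Printing rather than executing keeps the sketch safe to try. Piping the output to sh from the public_mm directory would run the printed commands.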
You can determine whether the servers are running with the command:
% ps ax | grep java
The output should look something like this:
11318 pts/4 S+ 0:00 grep java
21254 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -cp ... /MedPost-SKR/Tagger_server/lib/mps.jar taggerServer
21267 ? Sl 0:10 /usr/local/j2sdk1.4.2_06/bin/java -Xmx2g ... WSD_Server/lib/log4j-1.2.8.jar wsd.server.DisambiguatorServer
Once the servers have been started and verified, you can test your
MetaMap installation by using the following command:
% echo "lung cancer" | ./bin/metamap07 -I
OR
% echo "lung cancer" | ./bin/metamap08 -I
You should see a result similar to the following:
MetaMap (2007)
Control options:
  tag_text
  no_acros_abbrs
  an_derivational_variants
  stop_large_n
  plain_syntax
  candidates
  semantic_types
  mappings
  best_mappings_only
  show_cuis
Initializing db_access (07)...
Berkeley DB databases (normal strict model) are open.
Static variants will come from table varsan.
Accessing lexicon <parent directory>/public_mm/lexicon/data/lexiconStatic2007.
Variant generation mode: static.
Initializing tagger on localhost...
Processing 00000000.tx.1: lung cancer
Phrase: "lung cancer"
Meta Candidates (8):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
   861 C0006826:Cancer (Malignant Neoplasms) [Neoplastic Process]
   861 C0024109:Lung [Body Part, Organ, or Organ Component]
   861 C0998265:Cancer (Cancer Genus) [Invertebrate]
   861 C1278908:Lung (Entire lung) [Body Part, Organ, or Organ Component]
   861 C1306459:Cancer (Primary malignant neoplasm) [Neoplastic Process]
   768 C0032285:Pneumonia [Disease or Syndrome]
Meta Mapping (1000):
  1000 C0684249:Lung Cancer (Carcinoma of lung) [Neoplastic Process]
Meta Mapping (1000):
  1000 C0242379:Lung Cancer (Malignant neoplasm of lung) [Neoplastic Process]
If there are no errors starting the WSD and Tagger servers, and you had
a successful test, then MetaMap can be run as follows:
% ./bin/metamap07
OR
% ./bin/metamap08
MetaMap has a plethora of options that are explained elsewhere
(MetaMap 2008 Usage, or MetaMap 2007 Usage).
Un-Install:
Before un-installing MetaMap, make sure both of the MetaMap servers have been stopped (see Stopping the servers). To un-install MetaMap, move to the parent directory of your MetaMap installation and run the uninstall program:
% cd <parent directory of installation>
% ./public_mm/bin/uninstall.sh
Do you really want to uninstall MetaMap? [no/yes] yes
Removing Tagger Server
Removing WSD Server
Removing Lexicon
Removing MetaMap Databases
Removing Programs
Removing Base Directory
Removal of MetaMap installation successful.
%
Using MetaMap:
For more information on running MetaMap and its many options, please see these references:
Last Modified: November 04, 2008 | ii-public
Lister Hill National Center for Biomedical Communications | U.S. National Library of Medicine | National Institutes of Health | Department of Health and Human Services