Skip to Content

Rich Release Format MySQL Load Script

This document is a guide to loading an RRF subset of the UMLS Metathesaurus into a MySQL database by using the scripts and files created by MetamorphoSys. To generate, the necessary files and scripts, select the "Write MySQL Load Script" checkbox on the "Output Options" tab.


Recommended Software Version

MySQL Server 5.0


Creating a Database

In order to load your subset into a MySQL database, you will have to either use an existing database or create a new one. Two important considerations when creating a MySQL database are the default character set and collation settings.  We recommend using UTF8 and the utf8_unicode_ci collation setting. For example,

CREATE DATABASE IF NOT EXISTS umls CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Configuring Parameters for Performance

To optimize for read performance, the MySQL 5 server requires changing buffer sizes to make use of the memory available. Each parameter is subject to the amount of memory on each server and should be sized appropriately in the my.cnf or my.ini file


Configuring and Running the Load Script

Windows

If you are running MetamorphoSys in a Windows environment, in your output directory (after running MetamorphoSys with the "Write MySQL Load Script" box checked) you should see a batch file script called "populate_mysql_db.bat".  This script can be used to load your subset into a MySQL database.   The directory will also contain an SQL script for creating the tables, and an SQL script for creating the indexes..

Edit the script and configure the four parameters at the top of the script.  These are:

set MYSQL_HOME=<path to MYSQL_HOME>
set user=<username>
set password=<password>
set db_name=<db_name>

MYSQL_HOME should be set to the installation directory of your local MySQL server.  This is likely to be something like "C:\Program Files\MySQL\MySQL Server 5.0".  Remember to enclose the value in quotes if the directory name contains spaces.   After configuring these four parameters, this section of your script may look something like the following.

set MYSQL_HOME="C:\Program Files\MySQL\MySQL Server 5.0"
set user=myuser
set password=p4ssw0rd
set db_name=umls

You are now ready to run the script.  To do so, navigate to your subset directory using the Windows explorer and double-click on the "populate_mysql_db.bat" file you just finished editing.

Here are some things to keep in mind when using this script:

  1. The script is written to expect that there is a local MySQL Server installation and that you plan to load your UMLS Metathesaurus subset into a database being managed by this local server.   Before getting this far, you should have already installed and configured your MySQL Server.   For more information on that, you are encouraged to visit http://www.mysql.com.
  2. If your database is configured without a password you can leave the password setting blank, but you will have to update the script "populate_mysql_db.bat" and remove all references to the "-p%password%" parameter.  Thus, a line like this:

    %MYSQL_HOME%\bin\mysql -u %user% -p%password% --local-infile=1 %db_name% < mysql_tables.sql >> mysql.log 2>&1

    Must be changed to this

    %MYSQL_HOME%\bin\mysql -u %user% --local-infile=1 %db_name% < mysql_tables.sql >> mysql.log 2>&1

  3. The script will automatically create a log file called "mysql.log" in the same directory as the script.  If you are planning on running the script more than once, you should consider renaming the log file before your second run, otherwise the contents of the first run will be lost.
  4. While the script is running, you can monitor the process from within the MySQL command shell by using "mysql> show processlist;"
  5. The load may take many hours (especially if you have selected a large subset). When it completes, each of the tables will be loaded, and an initial set of indexes will have been created.


Linux, Macintosh, or Solaris

If you are running MetamorphoSys in a Linux, Macintosh, or Solaris environment, in your output directory (after running MetamorphoSys with the "Write MySQL Load Script" box checked) you should see a .sh shell script called "populate_mysql_db.sh".  This script can be used to load your subset into a MySQL database.   The directory will also contain an SQL script for creating the tables, and an SQL script for creating the indexes..

Edit the script and configure the four parameters at the top of the script.  These are:

MYSQL_HOME=<path to MYSQL_HOME>
user=<username>
password=<password>
db_name=<db_name>

MYSQL_HOME should be set to the installation directory of your local MySQL server.  Another way to think of this is that the setting of MYSQL_HOME should contain a bin/ directory that contains the "mysql" command.  This may even be something like "/usr".  After configuring these four parameters, this section of your script may look something like the following.

MYSQL_HOME=/usr
user=myuser
password=p4ssw0rd
db_name=umls

You are now ready to run the script.  To do so, navigate to your subset directory and invoke the script from the command line.  It may be necessary to change the file permissions to make the script executable before it can be started. For example:

% cd <subset directory>
% chmod 775 populate_mysql_db.sh
% populate_mysql_db.sh &

Here are some things to keep in mind when using this script:

  1. The script is written to expect that there is a local MySQL Server installation and that you plan to load your UMLS Metathesaurus subset into a database being managed by this local server.   Before getting this far, you should have already installed and configured your MySQL Server.   For more information on that, you are encouraged to visit http://www.mysql.com.
  2. If your database is configured without a password you can leave the password setting blank, but you will have to update the script "populate_mysql_db.bat" and remove all references to the "-p%password%" parameter.  Thus, a line like this:

    $MYSQL_HOME/bin/mysql -u $user -p$password $db_name < mysql_tables.sql >> mysql.log 2>&1

    Must be changed to this

    $MYSQL_HOME/bin/mysql -u $user $db_name < mysql_tables.sql >> mysql.log 2>&1

  3. The script will automatically create a log file called "mysql.log" in the same directory as the script.  If you are planning on running the script more than once, you should consider renaming the log file before your second run, otherwise the contents of the first run will be lost.
  4. While the script is running, you can monitor the process from within the MySQL command shell by using "mysql> show processlist;"
  5. The load may take many hours (especially if you have selected a large subset). When it completes, each of the tables will be loaded, and an initial set of indexes will have been created.


Sample Queries

We have provided a number of sample queries to demonstrate how to access certain kinds of data.

Get the English preferred name for a CUI.

SELECT str FROM MRCONSO
WHERE cui = ? AND stt = 'PF'
     AND ts = 'P' AND ispref = 'Y' AND lat = 'ENG';

Get MRCONSO rows (atoms) for a SAB and CODE, in order of precedence

SELECT * FROM MRCONSO
WHERE sab = ? AND code = ?
ORDER BY cui, lat, lower(ts), lui, stt, sui, ispref desc;

Get all MRCONSO rows (atoms) for a CUI, in order of precedence.

SELECT * FROM MRCONSO
WHERE cui = ?
ORDER BY cui, lat, lower(ts), lui, stt, sui, ispref desc;

Get all MRCONSO rows (atoms), with STYs, for a CUI, in order of precedence.

SELECT a.*, sty FROM MRCONSO a, MRSTY b
WHERE a.cui = ? AND a.cui = b.cui
ORDER BY cui, lat, lower(ts), lui, stt, sui, ispref desc;

Get all MRCONSO rows (atoms)  for a particular STY, in order of precedence by CUI.

SELECT a.* FROM MRCONSO a, MRSTY b
WHERE sty = ? AND a.cui = b.cui
ORDER BY cui, lat, lower(ts), lui, stt, sui, ispref desc;

Perform a norm string lookup in MRXNS_ENG to get MRCONSO rows.

SELECT a.* FROM MRCONSO a, MRXNS_ENG b
WHERE a.cui = b.cui and nstr = ? and a.sui = b.sui;

Find all Semantic Types for a particular MRCONSO string.

SELECT sty FROM MRCONSO a, MRSTY b
WHERE a.cui = b.cui AND str = ?;

Find all contexts for an atom.

SELECT * FROM MRHIER
WHERE cui = ? AND aui = ?
ORDER BY cui, aui, sab, cxn;

Find all definitions for all concepts containing a particular string.

SELECT def FROM MRCONSO a, MRDEF b
WHERE a.cui = b.cui AND str = ?;

Find all definitions for a particular string

SELECT def FROM MRCONSO a, MRDEF b
WHERE a.aui = b.aui AND str = ?;

Find preferred MUI of a MeSH D#.

SELECT scui FROM MRCONSO
WHERE sab = 'MSH' AND tty = 'MH'
     AND sdui = ?;

Last reviewed: 05 July 2007
Last updated: 05 July 2007
First published: 05 July 2007
Metadata| Permanence level: Permanent: Dynamic Content