National Cancer Institute   U.S. National Institutes of Health www.cancer.gov
caBIG® Knowledge Center: A part of the Enterprise Support Network

LexGrid Resources

From ESNWiki


What is the LexGrid?

LexGrid (Lexical Grid) provides support for a distributed network of lexical resources such as terminologies and ontologies via standards-based tools, storage formats, and access/update mechanisms.

The Lexical Grid Vision - a distributed network of terminological resources.

Why was LexGrid Developed? (ppt)

Currently, there are many terminologies and ontologies in existence. But just about every terminology has its own format, its own set of tools, and its own update mechanisms. The only thing that most of these pieces have in common with each other is their incompatibility. This makes it very hard to use these resources to their full potential. We have designed the Lexical Grid as a way to bridge terminologies and ontologies with a common set of tools, formats and update mechanisms. The Lexical Grid is:

LexGrid interlocking components
  • accessible through a set of common APIs
  • joined through shared indices
  • online accessible
  • downloadable
  • loosely coupled
  • locally extendable
  • globally revised
  • available in web-space on web-time
  • cross-linked

The realization of this vision requires three interlocking components, which are:

  • Standards - access methods and formats need to be published and openly available
  • Tools - standards-based tools must be readily available
  • Content - commonly used terminologies have to be available for access and download

The following is an archived presentation on the Lexical Grid as presented by Harold Solbrig.

The Division of Biomedical Informatics, Mayo Clinic, has developed an infrastructure to gather the content of coding and classification schemes into an integrated framework called the Lexical Grid. This presentation will cover some of the motivations behind this project. It will describe the architecture and functionality that exists in the Lexical Grid today and will also touch on potential next steps and future applications.

Lexical Grid as presented by Harold Solbrig

October 22, 2008 Video presentation on LexGrid and LexBIG

Thomas M. Johnson, Harold R. Solbrig; Division of Biomedical Statistics and Informatics, Mayo Clinic.

Common Terminology Services

The Common Terminology Services (CTS) specification was developed as an alternative to a common data structure. We have published the HL7 CTS specification and a reference implementation of the specification.

The HL7 Common Terminology Services (HL7 CTS) is an Application Programming Interface (API) specification that is intended to describe the basic functionality that will be needed by HL7 Version 3 software implementations to query and access terminological content. It is specified as an API rather than a set of data structures to enable a wide variety of terminological content to be integrated within the HL7 Version 3 messaging framework without the need for significant migration or rewrite.

Instead of specifying what an external terminology must look like, HL7 has chosen to identify the common functional characteristics that an external terminology must be able to provide. As an example, an HL7 compliant terminology service will need to be able to determine whether a given concept code is valid within the particular resource. Instead of describing a table keyed by the resource identifier and concept code, the CTS specification describes an Application Programming Interface (API) call that takes a resource identifier and concept code as input and returns a true/false value. Each terminology developer is free to implement this API call in whatever way is most appropriate for them.
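The validity check described above can be sketched as a small interface. This is a minimal illustration in Python, not the HL7 CTS API itself: the names `TerminologyService`, `is_valid_concept`, and `InMemoryTerminology`, as well as the resource identifier and codes, are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class TerminologyService(ABC):
    """Hypothetical interface: callers see a function, not a table layout."""

    @abstractmethod
    def is_valid_concept(self, resource_id: str, concept_code: str) -> bool:
        """Return True if concept_code is valid within the given resource."""


class InMemoryTerminology(TerminologyService):
    """One possible implementation, backed by a dict of sets."""

    def __init__(self, codes_by_resource: dict[str, set[str]]) -> None:
        self._codes = codes_by_resource

    def is_valid_concept(self, resource_id: str, concept_code: str) -> bool:
        # An unknown resource simply has no valid codes.
        return concept_code in self._codes.get(resource_id, set())


# Made-up resource identifier and concept codes, for illustration only.
service = InMemoryTerminology({"example-scheme": {"C001", "C002"}})
print(service.is_valid_concept("example-scheme", "C001"))  # True
print(service.is_valid_concept("example-scheme", "C999"))  # False
```

Another developer could back the same method with a relational database or a remote service without changing any calling code, which is the point of specifying a function rather than a data structure.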

There are two layers between HL7 Version 3 message processing applications and the target vocabularies. The upper layer, the Message API, communicates in terms of vocabulary domains, realms, coded attributes, and other artifacts of the RIM and HL7 messaging model. The lower layer, the Vocabulary API, communicates in terms of coding systems, concept codes, designations, and other vocabulary-related entities.
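The two-layer arrangement can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the class names `MessageLayer` and `VocabularyLayer`, their methods, and the simple one-to-one mapping from a vocabulary domain to a coding system are hypothetical and are not taken from the CTS specification.

```python
class VocabularyLayer:
    """Lower layer (hypothetical): speaks in coding systems, codes, designations."""

    def __init__(self, designations):
        # Shape: {coding_system: {concept_code: designation}}
        self._designations = designations

    def lookup_designation(self, coding_system, concept_code):
        """Return the designation for a code, or None if the code is unknown."""
        return self._designations.get(coding_system, {}).get(concept_code)


class MessageLayer:
    """Upper layer (hypothetical): speaks in vocabulary domains, delegates downward."""

    def __init__(self, vocabulary, domain_to_system):
        self._vocabulary = vocabulary
        # Assumed for this sketch: each domain maps to exactly one coding system.
        self._domain_to_system = domain_to_system

    def validate_code(self, vocabulary_domain, concept_code):
        """Translate the domain into a coding system, then ask the lower layer."""
        coding_system = self._domain_to_system.get(vocabulary_domain)
        if coding_system is None:
            return False
        designation = self._vocabulary.lookup_designation(coding_system, concept_code)
        return designation is not None


# Made-up domain, coding system, and code, for illustration only.
vocabulary = VocabularyLayer({"example-scheme": {"C001": "Example concept"}})
messages = MessageLayer(vocabulary, {"ExampleDomain": "example-scheme"})
print(messages.validate_code("ExampleDomain", "C001"))
print(messages.validate_code("ExampleDomain", "C999"))
```

In this sketch a message processing application would only ever call the upper layer, and only the lower layer would need to know how a particular terminology is stored.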

Lexicon Query Services

Original work on the LQS evolved into the LexGrid project. The LQS implementation has fallen behind the changes made to the LDAP database as LexGrid evolved, so it is no longer available for download. We hope to provide an LQS implementation here in the future that runs on LexGrid databases.

UMLS

The UMLS is a distribution of terminologies by the NLM, available for download in a well-documented format. The LexGrid download provides tooling to import the UMLS formats directly into the LexGrid formats, so that all of the LexGrid tooling can be used on any terminology from the UMLS.

Content and Storage

Many terminology storage formats, both standard and non-standard, are currently in use. The LexGrid Model and supporting software provide the ability to convert information from several common file formats, thereby making the content accessible through standardized API implementations and tooling.

Some of these formats are OWL (Web Ontology Language), RRF (Rich Release Format), OBO (Open Biomedical Ontologies), and LexGrid XML. More information on these formats can be found here.

LexGrid conversion software can import these supported vocabulary formats into relational databases (SQL-based access).

LexGrid-based projects can now access an SQL database schema consistent with the LexGrid model. This schema is currently available in two versions, LexGrid SQL and SQL Lite. The "Lite" version is designed for small-scale experimentation on an Access-based database. The regular SQL schema scales to large vocabularies on a variety of commercial and open source database management systems.
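As a rough idea of what SQL-based access to imported vocabulary content could look like, the sketch below runs a lookup against a small in-memory database. The table `concept` and its columns are hypothetical and are not the LexGrid SQL schema; consult the schema downloads for the real table definitions. Python's bundled `sqlite3` module is used only so the example is self-contained, and it is unrelated to the "SQL Lite" schema named above.

```python
import sqlite3

# In-memory database with a made-up, single-table layout (not the LexGrid schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    " coding_scheme TEXT NOT NULL,"
    " concept_code TEXT NOT NULL,"
    " designation TEXT NOT NULL,"
    " PRIMARY KEY (coding_scheme, concept_code))"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?)",
    [
        ("example-scheme", "C001", "Example concept one"),
        ("example-scheme", "C002", "Example concept two"),
    ],
)


def find_designation(connection, coding_scheme, concept_code):
    """Return the designation for a code, or None if no row matches."""
    row = connection.execute(
        "SELECT designation FROM concept"
        " WHERE coding_scheme = ? AND concept_code = ?",
        (coding_scheme, concept_code),
    ).fetchone()
    return row[0] if row else None


print(find_designation(conn, "example-scheme", "C001"))
print(find_designation(conn, "example-scheme", "C999"))
```

Because the query is parameterized, the same function would work unchanged against a larger database engine, provided the driver uses the same placeholder style.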

Data Models

The starting point for the LexGrid Data Model was the Object Management Group (OMG) Terminology Query Services specification, with some revisions and many simplifications.

The OMG Model

Lexicon Query Service Specification from OMG: a PDF file of the model.

Lexicon Query Service Specification IDL from OMG: the IDL of the model.


Supported Standards

Many terminology standards have been created by various organizations.

CTS (Common Terminology Services) and LQS (Lexicon Query Services) are both APIs for accessing content that is stored in the LexGrid Model.

UMLS (Unified Medical Language System) is a standard distribution format from the National Library of Medicine. RRF or Rich Release Format is an updated file format used in recent releases of the NCI MetaThesaurus.

***Please Note: The 2005 Model of LexGrid is no longer supported by LexBIG.***


Version 2008/01 of model and schema (revised - 04/09/2008)

Zip File of all parts of the 2008 LexGrid Data Model in XML Schema

Zip File of model in EA Format

The LexGrid Data Model in XML Schema

Version 2006/01 of model and schema (revised - 02/22/2006)

Zip File of all parts of the 2006 LexGrid Data Model in XML Schema

Zip File of 2006 LexGrid Data Model in Enterprise (EA) Format

The LexGrid Data Model in XML Schema

Documents

LexGrid Loader Mapping

LexGrid Ontology Loader Mapping. LexGrid Vocabulary Services for caBIG® (LexBIG). Authors: Scott Bauer, Craig Stancl

LexGrid Source Mapping Guide

LexGrid Source Mapping Guide. LexGrid Vocabulary Services for caBIG® (LexBIG). Version 1.0. Last Modified: September 26, 2008. Authors: Scott Bauer, Craig Stancl

Links