United States National Library of Medicine National Institutes of Health

Licensee Research Use of MEDLINE®/PubMed® Data:
Summary Report Sorted by Research Category:
Development of Information Extraction and/or Retrieval Methods

This report is sorted alphabetically by organization/institution. It lists NLM's licensees who have submitted information in the category shown below about their use of the data for research and permitted NLM to make their information available on the Web.

Click on the Project Summary for the corresponding complete report. General information about NLM's Web site for licensees' research projects is available.



DEVELOPMENT OF INFORMATION EXTRACTION AND/OR RETRIEVAL METHODS

1   Advanced Health Media
2840 Morris Ave, Union, NJ 07083
Eric Johnson

Project Name: IM2 KOL Identification
Project Summary: Currently, InsiteResearch uses the platform to retrieve data on therapeutic class experts to act as investigators or speakers for healthcare companies

2   Arity Corporation
Research & Development
Peter Gabel
peter.gabel@arity.com
Pamela Schaepe
pamela.schaepe@arity.com
Project Name: NLP Information Extraction for Biomedical Research & Development
Project Summary: The goal of the project is to use sophisticated Natural Language Processing techniques to maximize accuracy in the extraction of facts, associations, assertions and rules from biomedical literature.

3   ATA SpA - Advanced Technology Assessment

Massimo Riccaboni
info@atalab.com

Project Name:
Project Summary: The main goals of this project are to set up automated or semi-automated procedures to translate affiliation information provided by PubMed into structured data, with a particular emphasis on separating and identifying geographical information, and to develop suitable approaches to the analysis of unstructured texts in the biomedical domain.

4   Canaledge Inc.

Yoshiyuki Kobayashi
yashi@canaledge.com
Takao Asanuma
asanuma@canaledge.com
Project Name: text-mining system development
Project Summary: Development of a text-mining system for finding the data of protein/protein interactions or gene/disease or gene/chemical compound relationships, etc.

5   Carnegie Mellon University

Eric Nyberg

Project Name:
Project Summary: Research in the use of language processing technology to provide more intelligent access to information in MEDLINE abstracts

6   Children's Hospital of Philadelphia

Peter White
white@genome.chop.edu

Project Name: Mining the bibliome: Information extraction of the biomedical literature
Project Summary: Our goal is qualitatively better methods for automatically extracting information from the biomedical literature, relying on recent progress and new research in three areas: high-accuracy parsing, shallow semantic analysis, and integration of large volumes of diverse data.

7   Columbia University

Stephen Johnson
sbj2@columbia.ed

Project Name: CIQR
Project Summary: CIQR (seeker) explores Context-Initiated Question and Response: responding to physicians' questions when they are in the context of patient care.

8   David Calloway

David Calloway
calloway@novatechnologies.net

Project Name: wikipdf
Project Summary: The wikipdf tool (http://www.wikipdf.com) automatically generates glossaries of unusual terms for any article identified by a Medline citation.

9   EMBL-European Bioinformatics Institute
Rebholz Group
Peter Stoehr

Project Name: Whatizit
Project Summary: We focus on extraction of facts from scientific literature in molecular biology. This is mainly based on, but not limited to, pattern matching and other high-throughput methods. The group has experience in chunk parsing and natural language processing (NLP), and has applied its methods to different tasks. These include identification of terminology, abbreviations, mutations, and relations between named entities, e.g. protein-protein interactions.

10   Fujitsu Limited
Makuhari Systems Laboratory
Shuhei Kinoshita
kino@strad.ssg.fujitsu.com
Masato Mori
masatom@strad.ssg.fujitsu.com
Project Name: Bio Chemical Information Project
Project Summary: Development of NLP programs that extract Protein-Protein or Protein-Compound relationships.

11   Marquette University

Craig Struble
craig.struble@marquette.edu

Project Name:
Project Summary: We are using MEDLINE to develop information extraction tools for experimental techniques and protein kinase inhibitors.

12   NAIST

Kouichi Doi
doy@is.naist.jp

Project Name:
Project Summary: We research automatic extraction of protein-protein interactions. Our subgoals are named entity recognition of proteins, classification of verbs or verb phrases, and extraction of abbreviations.

13   Nara Institute of Science and Technology

Yuji Matsumoto
Masashi Shimbo
Project Name:
Project Summary: We apply statistical natural language processing techniques to information extraction and retrieval from MEDLINE abstracts.

14   National Cheng-Kung University
Department of Computer Science and Information Engineering
Wen-Hsiang Lu
whlu@mail.ncku.edu.tw

Project Name: MMODE: Cross-Language Medical Information Retrieval for Consumers
Project Summary: Many consumers in non-English-speaking countries are eager to access up-to-date health information from authoritative U.S. medical websites, such as PubMed and MedlinePlus. However, there is currently no cross-language medical information retrieval (CLMIR) system that helps Taiwanese consumers overcome the language barrier.

15   OmniViz, Inc.

Jeffrey Saffer

Project Name: Development of Visualization Software
Project Summary: Development of software capable of visualizing Medline data.

16   Polish Academy of Sciences
Institute of Biochemistry & Biophysics
Pawel Siedlecki

Project Name:
Project Summary: Use of the MEDLINE database to search for information about amino acid sequences.

17   Public Health Genetics Unit, Cambridge

Julian Higgins
julian.higgins@mrc-bsu.cam.ac.uk
Roger Hale
roger.hale@linguamatics.com
Project Name:
Project Summary: Investigation of the I2E text-mining system in human genome epidemiology.

18   Spanish National Biotechnology Center (CNB-CSIC)
Protein Design Group
Martin Krallinger
martink@cnb.uamm.es

martingenetech@yahoo.com
Project Name: NATURAL LANGUAGE PROCESSING STRATEGIES
Project Summary: This work explores basic aspects of text mining and NLP strategies for providing functional information for proteins and genes in the context of functional descriptions of genes, such as using protein-Gene Ontology term associations.

19   SUNY Stony Brook
Dept. of Computer Science
Steven Skiena
skiena@cs.sunysb.edu

Project Name:
Project Summary: TextMed is a search engine for medical entities: diseases, drugs, chemicals, organs and organisms. TextMed aims to identify relationships between these medical entities. TextMed uses natural language processing techniques to track medical entity references from the scientific literature, and a variety of statistical techniques to analyze the relationships between them.

20   U TX-Houston
School of Health Information Sciences
Elmer Bernstam
Elmer.V.Bernstam@uth.tmc.edu

Project Name: MedlineQBE
Project Summary: The goals of this project are to facilitate access to the biomedical literature by using techniques adapted from the World Wide Web. We are currently exploring citation analysis and collaborative filtering. In addition, we are exploring novel evaluation methods to compare alternative retrieval strategies.

21   University Health Network
Jurisica Lab
David Otasek

Project Name:
Project Summary: Automated extraction of protein-protein interactions from Medline/PubMed abstracts.

22   University of Texas at Austin
Center for Computational Biology and Bioinformatics
Edward Marcotte

Project Name:
Project Summary: The focus of our lab is the study of protein functions and protein-protein interactions by combining computational and bioinformatic approaches with experimental techniques.

23   Wageningen University and Research Centre
Laboratory of Bioinformatics
Jack Leunissen
jack.leunissen@wur.nl

Project Name: BIOMETA
Project Summary: Development of text mining techniques, including concept weighting and term disambiguation

Last updated: 22 August 2007
First published: 02 August 2005