NCBI Structure Group Home Page

The NCBI Structure Group

Resources for the Scientific Community

The resources developed by the Structure Group of the NCBI Computational Biology Branch (CBB) are freely available to the public and focus on four areas:

Macromolecular structures

The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, and more. It is possible, for example, to find 3D structures for homologs of a protein of interest by following the "Related Structure" link in an Entrez Protein sequence record (illustrated example). The 3D structures can be visualized and their sequence-structure relationships examined interactively. In addition, geometrically similar structures can be retrieved and superposed, making it possible to identify distant homologs that cannot be recognized by sequence comparison. In this way, the knowledge derived from 3D structures, which are currently available for selected protein family representatives only, may be extended to other family members.
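
The protein-to-structure link described above can also be requested programmatically. The following is a minimal sketch, assuming the NCBI E-utilities ELink service is used; the function name `related_structure_url` and the UID passed to it are illustrative placeholders, not taken from this page. The set of links returned may not match the "Related Structure" link on the web page exactly, because no specific link name is requested here.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed entry point for this sketch).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def related_structure_url(protein_uid: str) -> str:
    """Build an ELink request from an Entrez Protein record to the Structure database.

    `protein_uid` is a hypothetical placeholder for a real protein UID.
    The function only constructs the URL; it does not contact NCBI.
    """
    params = {"dbfrom": "protein", "db": "structure", "id": protein_uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Replace PROTEIN_UID with the UID of the protein of interest.
    print(related_structure_url("PROTEIN_UID"))
```

Fetching the resulting URL returns an XML document listing the linked structure UIDs, which can then be opened in MMDB.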

Resources	MMDB	3D Domains	VAST	Cn3D	CBLAST
Search	How To	Publications	FTP

Thumbnail image of domain hierarchy showing divergence in a protein family based on phylogenetic relationships of protein sequences and functional properties. Click on the image for more information about the tools in the Conserved Domains resource group.

Conserved domains and protein classification

Conserved domains are functional units within a protein that have been used as building blocks in molecular evolution and recombined in various arrangements to make proteins with different functions. The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, in addition to NCBI-curated domains that use 3D-structure information explicitly to define domain boundaries and provide insights into sequence/structure/function relationships. The data are then used for putative functional annotation of protein query sequences based on matches to specific hits (illustrated example) or superfamilies, identification of proteins with similar domain architectures, and protein classification.

Resources	CDD	CD-Search	CDTree	CDART
Search	How To	Publications	FTP	News

Thumbnail image of Gleevec, compound ID (CID) 5291. Click on image to open the home page for the PubChem resource group.

Small molecules and their biological activity

The PubChem project provides information on the biological activities of small molecules and is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem includes three databases: PCSubstance, PCBioAssay, and PCCompound. The first two are archives of data submitted by the scientific community about chemical substances, including medicines, and their biological activity. The third is a derived, non-redundant database of compounds that constitute the substances in PCSubstance. The PubChem data are linked to other data types (illustrated example) in the Entrez system, making it possible, for example, to retrieve information about a compound and then "Link" to its biological activity data, retrieve 3D protein structures bound to the compound and interactively view their active sites, and find biosystems that include the compound as a component.
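
The "retrieve a compound, then Link to its activity data" workflow can be sketched as two E-utilities requests. This is an illustration under stated assumptions rather than a PubChem-endorsed recipe: it assumes the Entrez database names `pccompound` and `pcassay`, uses CID 5291 (the compound shown in the thumbnail) as the example, and introduces a hypothetical helper `eutils_url`. It only builds the request URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed entry point for this sketch).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def eutils_url(tool: str, **params: str) -> str:
    """Return the URL for an E-utilities tool (e.g. 'esummary', 'elink')."""
    return f"{EUTILS}/{tool}.fcgi?{urlencode(params)}"


CID = "5291"  # compound ID used as the example on this page

# Step 1: retrieve summary information about the compound.
summary_url = eutils_url("esummary", db="pccompound", id=CID)

# Step 2: follow links from the compound to its bioassay records.
assay_links_url = eutils_url("elink", dbfrom="pccompound", db="pcassay", id=CID)

if __name__ == "__main__":
    print(summary_url)
    print(assay_links_url)
```

Changing the `db` argument in the second step to another Entrez database name would request links to that data type instead.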

PubChem Home	PCCompound	PCSubstance	PCBioAssay	Structure Search
PubChem3D	Deposit Data	Download Data	Publications	News

Thumbnail image showing the types of data you can obtain for a metabolic pathway in the NCBI BioSystems database, including genes, proteins, small molecules, and related biosystems. Click on image to read more about the BioSystems database.

Biological Systems

A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. The NCBI BioSystems Database provides centralized access to biological pathways from several source databases and connects the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system in order to facilitate computation on biosystems data. BioSystem records list and categorize components (illustrated example), such as the genes, proteins, and small molecules involved in a biological system, along with related biosystems and citations, and allow instant retrieval of those data sets through a wide range of Links.

About	Search	How To	Help	FAQ	FTP	News

Tools for Discovery

Discover associations among previously disparate data

Schematic depicting connections among various data types, such as literature, nucleotide and protein sequences, and three-dimensional structures. Click anywhere on this image to open a detailed example of the types of connections that exist and how to access them.

Various data types, such as literature, nucleotide and protein sequences, and three-dimensional structures, are often submitted to public databases independently of each other by different research groups. Yet these data are related through their coverage of the same topic via different research methods. The Structure Group contributes to the broader NCBI effort to identify associations among previously disparate data.

Revised 28 July 2009