"Entrez Structure and 3D-Domains Index" Help




Entrez Structure and 3D-Domains Index Overview

The Entrez Structure and 3D-Domains indexes are built on MMDB (the Molecular Modeling Database), Entrez's macromolecular 3D structure database. MMDB contains experimentally determined biopolymer structures obtained from the Protein Data Bank (PDB). Entrez provides a number of search tools for identifying structures of interest, and through them leads to extensive information on biological function and molecular evolution.

Entrez's Structure Index page is the "home page" for querying 3D structures. For each query, it lists the PDB names and general descriptions of the structures in the result set, along with links to further information. On the right-hand side of the page, each returned structure entry has an "MMDB" button linking to the MMDB Structure Summary page and a "Links" menu linking, where applicable, to the Entrez 3D Domains Index, the Entrez Protein or Nucleotide Index, the Entrez PubMed Citations Index, or the Entrez Taxonomy Index, among others.

Entrez's 3D-Domains Index page is the "home page" for querying 3D domains. For each query, it lists the domain names in the result set and the general descriptions of the structures they belong to, along with links to further information. On the right-hand side of the page, each returned 3D domain entry has an "MMDB" button linking to the MMDB Structure Summary page, a "VAST" button linking to the VAST 3D-Domain Neighbors Summary page, and a "Links" menu linking, where applicable, to the Entrez Structure Index, the Entrez Protein or Nucleotide Index, the Entrez PubMed Citations Index, or the Entrez Taxonomy Index, among others.


How To Make a Query

Go to the Entrez page: http://www.ncbi.nlm.nih.gov/entrez

For Structure or 3D-Domains queries, choose the "Search Structure" or "Search 3D Domains" menu item at the top left of the page.

There are four types of queries: (1) string query, (2) integer query, (3) date query, and (4) range query. For all of them, type the token to be queried in the text box, followed by a field alias in square brackets. All queries are case-insensitive.

By default, a string query without a field alias means query against [ALL]; an integer query without a field alias means query against [UID].
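The token-plus-alias syntax and its defaults can be sketched in a few lines of Python. This is only an illustration of the rules described above; the function name `fielded_query` is hypothetical and is not part of Entrez.

```python
def fielded_query(token, alias=None):
    """Build a query string of the form 'token [ALIAS]'.

    When no alias is given, the documented defaults are applied:
    an integer token goes to [UID], any other token goes to [ALL].
    (Hypothetical helper for illustration only.)
    """
    token = str(token).strip()
    if alias is None:
        alias = "UID" if token.isdigit() else "ALL"
    # Queries are case-insensitive, so upper-casing the alias is cosmetic.
    return f"{token} [{alias.upper()}]"


print(fielded_query("tyrosine kinase"))  # string query, default field
print(fielded_query(19741))              # integer query, default field
print(fielded_query("1b3o", "accn"))     # explicit field alias
```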

Date queries must use one of the following formats: YYYY/MM/DD, YYYY/MM, or YYYY. Single-digit months and days are allowed and do not have to be padded with a leading 0, so forms such as YYYY/MM/D, YYYY/M/D, and YYYY/M are also valid.
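As a rough check before submitting a query, the accepted date shapes can be expressed as a regular expression. The sketch below is an assumption about how one might validate input locally; the name `is_valid_date_token` is hypothetical, and the month and day bounds (1 to 12, 1 to 31) are a sanity check added here rather than rules stated by Entrez.

```python
import re

# Year, optionally followed by /month, optionally followed by /day.
# Month and day may be one or two digits (zero padding is optional).
_DATE_RE = re.compile(r"^(\d{4})(?:/(\d{1,2})(?:/(\d{1,2}))?)?$")


def is_valid_date_token(text):
    """Return True if text looks like YYYY, YYYY/M[M] or YYYY/M[M]/D[D]."""
    match = _DATE_RE.match(text.strip())
    if match is None:
        return False
    _, month, day = match.groups()
    # Bounds below are an added plausibility check, not an Entrez rule.
    if month is not None and not 1 <= int(month) <= 12:
        return False
    if day is not None and not 1 <= int(day) <= 31:
        return False
    return True
```

For instance, `is_valid_date_token("2001/7/24")` accepts the unpadded month, while a token written with hyphens, such as `"2001-07-24"`, is rejected.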

Range queries are built from two tokens (a "from" value and a "to" value) separated by a colon (:), followed by a field alias in square brackets. All date fields and all 'count' fields (residue counts, helix counts, etc.) can be range queried. Two additional fields can also be range queried: Resolution [RESO] in the Structure Index and MolWeight [MWT] in the 3D-Domains Index.

Range queries on Resolution [RESO] (in angstroms) must have the following format:
fromResolution:toResolution [RESO].

Range queries on MolWeight [MWT] (in daltons) must have the following format:
fromMoleculeWeight:toMoleculeWeight [MWT].

Range queries on dates have a similar format:
FromDate:ToDate [field-alias] (FromDate and ToDate are of the form YYYY/MM/DD, YYYY/MM, YYYY/M, YYYY, etc.)

Range queries on 'counts' have the format:
FromCount:ToCount [field-alias] (FromCount and ToCount are integers)
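All four range formats share the same shape, `from:to [alias]`, so a single helper can produce them. The following is a minimal sketch; `range_query` is a hypothetical name, and the check for empty tokens or stray colons is an assumption added for safety, not documented Entrez behavior.

```python
def range_query(start, end, alias):
    """Build a range query string of the form 'start:end [ALIAS]'.

    start and end may be numbers (resolution, weight, counts) or
    date strings such as '1999/07'. (Hypothetical helper.)
    """
    start, end = str(start).strip(), str(end).strip()
    for token in (start, end):
        # The colon is the range separator, so it cannot appear in a token.
        if not token or ":" in token:
            raise ValueError(f"invalid range token: {token!r}")
    return f"{start}:{end} [{alias.upper()}]"


print(range_query(1.97, 2.14, "RESO"))          # resolution, in angstroms
print(range_query("1999/07", "2000", "PDAT"))   # publication date
print(range_query(5, 7, "pcc"))                 # protein chain count
```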

Special notes on querying PdbChainCode [CHN]: A PDB chain code can be any of a wide variety of characters, including white space (which cannot be queried directly in Entrez). PDB chain codes are also case-sensitive, whereas the Entrez search engine is case-insensitive. To support queries on special characters such as white space, a PdbChainCode query accepts either the character itself or its decimal ASCII code. For example, to query PDB chain code 'A', you can enter either 'A [CHN]' (which the Entrez search engine interprets as either 'a' or 'A') or '65 [CHN]' (which unambiguously means upper-case 'A'). To query a white-space PDB chain code, the only option is '32 [CHN]' (see the examples below). To avoid upper/lower-case ambiguity, it is recommended that you enter PdbChainCode queries as decimal ASCII codes.
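The decimal ASCII code of a chain character can be obtained with Python's built-in `ord`. The sketch below converts a one-character chain code into the unambiguous numeric form recommended above; the function name `chain_code_query` is hypothetical.

```python
def chain_code_query(chain):
    """Return a [CHN] query that uses the decimal ASCII code of chain.

    Using the numeric code sidesteps both the case-insensitivity of the
    search engine and the problem of querying a white-space chain code.
    (Hypothetical helper for illustration only.)
    """
    if len(chain) != 1:
        raise ValueError("a PDB chain code is a single character")
    return f"{ord(chain)} [CHN]"


print(chain_code_query("A"))  # upper-case A only
print(chain_code_query(" "))  # white-space chain code
```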

The following sections on Structure Query Capabilities and on 3D-Domains Query Capabilities list all the field aliases for the different fields. One field may have several aliases; in that case, use whichever you find easiest to remember.


Query Examples

For queries on Entrez Structure:

tyrosine kinase
nmr structure [TITL]
1b3o [ACCN]
19741
3.2.1.17 [EC]
3.2.1.- [EC]
3.2.*.* [EC]
5:7 [PCC]
A [CHN]
65 [CHN]
32 [CHN]
2001/07/24:2003/02/16 [MDAT]
1999/07:2000 [PDAT]
1.97:2.14 [RESO]
SO4 [LCOD]
benzamidine [LNAM]
beta-mercaptoethanol [LDES]
isomerase [PCLS]
lysozyme [PCOM]
African Clawed Frog [PSRC]

For queries on Entrez 3D-Domains:

1b3oa1 [NAME]
2001/7/24:2003/02/16 [PRD]
2001:2003/2 [PRD]
fission yeast [ORGN]
pap [PDES]
A [CHN]
65 [CHN]
32 [CHN]
23:32 [RC]
3:7 [HC]
11:29 [MPRC]
11369.52:13521.06 [MWT]
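Each example above is a token optionally followed by a field alias in square brackets. If you keep such queries in a script, a small parser can split them back into their two parts. This is a sketch only; `split_query` is a hypothetical name, and the pattern assumes that aliases consist solely of letters, as they do in the lists in this document.

```python
import re

# Token (as short as possible), optional white space, optional [ALIAS] at the end.
_QUERY_RE = re.compile(r"^(?P<token>.*?)\s*(?:\[(?P<alias>[A-Za-z]+)\])?$")


def split_query(query):
    """Split 'token [ALIAS]' into (token, ALIAS); ALIAS is None if absent."""
    match = _QUERY_RE.match(query.strip())
    alias = match.group("alias")
    return match.group("token"), (alias.upper() if alias else None)


for example in ("nmr structure [TITL]", "19741", "3.2.*.* [EC]"):
    print(split_query(example))
```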


Structure Query Result Entries

The structure result entries include the following:
  • PdbAcc (aliases: [ACCN, PACC, PDBACC]): The four-character identifier assigned by PDB to specify the PDB structure. Clicking it opens the corresponding MMDB Structure Summary page.
  • PdbDescr (aliases: [PDSC, PDES]): A brief description of the PDB structure.
  • Uid (aliases: [UID, ID, MMDBID]): The integer assigned by MMDB to uniquely specify the PDB structure.


Structure Query Capabilities

The following fields can be queried in the Entrez Structure index (field aliases are given in square brackets; where a field has several aliases, pick whichever is easiest to remember):
  • All [ALL]: All of the following fields are searched. If a string query is presented without a field alias, by default, [ALL] is searched.
  • Uid [UID, ID, MMDBID]: The integer assigned by MMDB to uniquely specify the PDB structure. If an integer query is presented without a field alias, by default, [UID] is searched. For structures, the UIDs are MMDB IDs.
  • PdbAcc [ACCN, PACC, PDBACC]: The four-character identifier assigned by PDB to specify the PDB structure.
  • EC [EC]: The EC number of the PDB structure. This field can be queried with wild-card feature:
    3.2.1.- [EC]
    3.2.*.* [EC]
    3.2.* [EC]
    and so on. Note that the queries 3.2.*.* [EC] and 3.2.* [EC] return the identical set of PDB structures, so the two queries are equivalent.
  • Resolution [RESO, RESL, RES]: The resolution (in angstroms) of a protein structure. This field can be range queried with the above specified format.
  • ExpMethod [EXPM, EXP]: The experimental method used (X-Ray, NMR, etc.) to characterize the protein structure.
  • Title [TITL, TITLE]: The title of the publication that reported the PDB structure findings.
  • Author [AUTH, AU]: The author of the publication that reported the PDB structure findings.
  • Journal [JOUR, JOURNAL]: The journal of the publication that reported the PDB structure findings.
  • PubDate [PDAT, PDATE, DP]: The date of the publication that reported the PDB structure findings. This field can be range queried with the above specified format.
  • MmdbDate [MDAT, MMDBDATE]: The date when the PDB structure data were loaded into MMDB. This field can be range queried with the above specified format.
  • PdbReleaseDate [PRD, RDAT, PRDATE]: The PDB release date of the PDB structure. This field can be range queried with the above specified format.
  • PdbClass [PCLA, PCLS]: The classification of the PDB structure.
  • PdbSource [PSRC, PSOU]: The sample source of the PDB structure.
  • PdbDescr [PDSC, PDES]: The brief description of the PDB structure.
  • PdbComment [PCOM, PCMT]: The more detailed description of the PDB structure.
  • Organism [ORGN]: The organism and lineage of the PDB structure.
  • PdbChainCode [CHN, CHNC, CCODE]: The 1-letter PDB chain code.
  • LigCode [LCOD, LIGC, LCODE]: The 3-letter code of a ligand in the PDB structure.
  • LigName [LNAM, LIGN, LNAME]: The PDB definition of a ligand in the PDB structure.
  • LigDescr [LDES, LIGD, LDSC, LDESC]: The author's brief description of a ligand in the PDB structure.
  • LigCount [LCOU, LCNT, LCOUNT]: The number of different types of ligands (not the number of ligands) in the PDB structure. This field can be range queried with the above specified format.
  • ModProteinResCount [MPRC, MPRCNT, MPRCOUNT]: The number of modified protein residues in the PDB structure. This field can be range queried with the above specified format.
  • ModDNAResCount [MDRC, MDRCNT, MDRCOUNT]: The number of modified DNA residues in the PDB structure. This field can be range queried with the above specified format.
  • ModRNAResCount [MRRC, MRRCNT, MRRCOUNT]: The number of modified RNA residues in the PDB structure. This field can be range queried with the above specified format.
  • ProteinChainCount [PCC, PCCNT, PCCOUNT]: The number of protein chains in the PDB structure. This field can be range queried with the above specified format.
  • DNAChainCount [DCC, DCCNT, DCCOUNT]: The number of DNA chains in the PDB structure. This field can be range queried with the above specified format.
  • RNAChainCount [RCC, RCCNT, RCCOUNT]: The number of RNA chains in the PDB structure. This field can be range queried with the above specified format.
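The equivalence of 3.2.*.* [EC] and 3.2.* [EC] noted under the EC field can be illustrated with a local matcher that ignores trailing wildcard levels. This is a sketch of one plausible reading, not the Entrez implementation: the function `ec_matches` is hypothetical, and treating "-" as a literal character (rather than as a wildcard) is an assumption.

```python
def ec_matches(pattern, ec_number):
    """Return True if ec_number matches an EC pattern with '*' wildcards.

    Trailing '*' levels are dropped first, so '3.2.*.*' and '3.2.*'
    reduce to the same prefix and therefore match the same numbers.
    A '-' level is compared literally (an assumption of this sketch).
    """
    parts = pattern.strip().split(".")
    while parts and parts[-1] == "*":
        parts.pop()
    levels = ec_number.strip().split(".")
    if len(levels) < len(parts):
        return False
    # zip stops at the shorter list, i.e. at the end of the pattern prefix.
    return all(p == "*" or p == level for p, level in zip(parts, levels))


print(ec_matches("3.2.*.*", "3.2.1.17"))
print(ec_matches("3.2.*", "3.2.1.17"))
```

Both calls give the same answer for any EC number, which mirrors the statement that the two query forms return the same set of structures.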


What are "3D-Domains"?

3D-Domains within individual polypeptide chains in MMDB are identified automatically by an algorithm that searches for one or more breakpoints, falling between major secondary structure elements, such that the ratio of intra- to inter-domain contacts exceeds a set threshold. The 3D-Domains identified in this way provide a means to increase the sensitivity of structure neighbor calculations and to present 3D superpositions based on compact domains as well as on complete polypeptide chains. They are not intended to represent domains identified by comparative sequence and structure analysis, that is, modules that recur in related proteins, though the domain boundaries identified by the two approaches often agree well.
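The contact-ratio criterion can be made concrete with a toy example. The sketch below is not the MMDB algorithm: it considers a single breakpoint at a time, takes residue contacts as a given list of index pairs, and uses a caller-supplied threshold. The function names, the contact list, and the threshold value are all hypothetical.

```python
def contact_ratio(contacts, breakpoint):
    """Ratio of intra- to inter-domain contacts for one breakpoint.

    contacts is an iterable of (i, j) residue index pairs. Residues with
    index < breakpoint form one candidate domain, the rest the other.
    """
    intra = inter = 0
    for i, j in contacts:
        if (i < breakpoint) == (j < breakpoint):
            intra += 1  # both residues fall on the same side
        else:
            inter += 1  # the contact crosses the breakpoint
    return float("inf") if inter == 0 else intra / inter


def acceptable_breakpoints(contacts, candidates, threshold):
    """Keep candidate breakpoints whose contact ratio exceeds threshold.

    In the real procedure candidates would lie between major secondary
    structure elements; here they are simply supplied by the caller.
    """
    return [b for b in candidates if contact_ratio(contacts, b) > threshold]


# Two tightly packed triplets of residues joined by a single contact.
toy_contacts = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(acceptable_breakpoints(toy_contacts, candidates=[1, 3], threshold=3.0))
```

In this toy chain, splitting between residues 2 and 3 severs only one contact, so that breakpoint passes, whereas splitting after residue 0 cuts through a compact cluster and does not.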

The structure similarities among individual chains and their compact 3D-Domains in MMDB are calculated by the VAST algorithm, which superposes structures based on alignments of their secondary structure elements.


3D-Domains Query Result Entries

The 3D-domains result entries include the following:
  • Name (alias: [NAME]): The name of the 3D domain. It is formed by concatenating the four-character PDB code of the structure, the one-letter chain code assigned by PDB, and an integer domain number on the given chain. Clicking it opens the corresponding MMDB Structure Summary page.
  • PdbDescr (aliases: [PDSC, PDES]): The brief description of the PDB structure that has the specified 3D domain.
  • Uid (aliases: [UID, ID, SDI]): The integer assigned by MMDB to uniquely specify the 3D domain. Clicking it opens the corresponding VAST 3D-Domain Neighbors Summary page.
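Because a 3D domain name is a fixed-layout concatenation (four-character PDB code, one-character chain code, integer domain number), it can be taken apart with a simple pattern. The sketch below is illustrative; `parse_domain_name` is a hypothetical name, and the pattern assumes the PDB code is exactly four alphanumeric characters.

```python
import re

# Four-character PDB code, one chain character, then the domain number.
_NAME_RE = re.compile(r"^(?P<pdb>[0-9A-Za-z]{4})(?P<chain>.)(?P<domain>\d+)$")


def parse_domain_name(name):
    """Split a 3D domain name into (pdb_code, chain_code, domain_number)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a 3D domain name: {name!r}")
    return match.group("pdb"), match.group("chain"), int(match.group("domain"))


print(parse_domain_name("1b3oa1"))
```

Applied to the example name 1b3oa1 from the query examples, this yields PDB code 1b3o, chain a, and domain number 1.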


3D-Domains Query Capabilities

The following fields can be queried in the Entrez 3D-Domains index (field aliases are given in square brackets; where a field has several aliases, pick whichever is easiest to remember):
  • All [ALL]: All of the following fields are searched. If a string query is presented without a field alias, by default, [ALL] is searched.
  • Uid [UID, ID, SDI]: The integer assigned by MMDB to uniquely specify the 3D domain. If an integer query is presented without a field alias, by default, [UID] is searched. For 3D domains, the UIDs are SDIs (structure domain identifiers).
  • MmdbId [MID, MMDB, MMDBID]: The integer assigned by MMDB to uniquely specify the PDB structure.
  • Name [NAME]: The name of the 3D domain. It is formed by concatenating the four-character PDB code of the structure, the one-letter chain code assigned by PDB, and an integer domain number on the given chain.
  • PdbAcc [ACCN, PACC, PDBACC]: The four-character identifier assigned by PDB to specify the PDB structure.
  • Title [TITL, TITLE]: The title of the publication that reported the PDB structure findings.
  • Author [AUTH, AU]: The author of the publication that reported the PDB structure findings.
  • PubDate [PDAT, PDATE, DP]: The date of the publication that reported the PDB structure findings. This field can be range queried with the above specified format.
  • MmdbDate [MDAT, MMDBDATE]: The date when the PDB structure data were loaded into MMDB. This field can be range queried with the above specified format.
  • PdbReleaseDate [PRD, RDAT, PRDATE]: The PDB release date of the PDB structure. This field can be range queried with the above specified format.
  • PdbChainCode [CHN, CHNC, CCODE]: The 1-letter PDB chain code.
  • PdbClass [PCLA, PCLS]: The classification of the PDB structure.
  • PdbSource [PSRC, PSOU]: The sample source of the PDB structure.
  • PdbDescr [PDSC, PDES]: The brief description of the PDB structure.
  • PdbComment [PCOM, PCMT]: The more detailed description of the PDB structure.
  • Organism [ORGN]: The brief description of organism and lineage of the PDB structure.
  • DomainNo [DN, DNUM]: The domain number (across a given chain on the PDB structure) of the 3D domain.
  • CumulDomainNo [CDN, CDNM, CNUM]: The cumulative domain number (across all chains on the PDB structure) of the 3D domain.
  • ModProteinResCount [MPRC, MPRCNT, MPRCOUNT]: The number of modified protein residues in the PDB structure. This field can be range queried with the above specified format.
  • HelixCount [HC, HCNT, HCOUNT]: The number of alpha-helices in the 3D domain. This field can be range queried with the above specified format.
  • StrandCount [SC, SCNT, SCOUNT]: The number of beta-strands in the 3D domain. This field can be range queried with the above specified format.
  • ResCount [RC, RCNT, RCOUNT]: The number of residues in the 3D domain. This field can be range queried with the above specified format.
  • MolWeight [MWT]: The molecular weight (in daltons) of the 3D domain. This field can be range queried with the above specified format.



Updated

1/11/2005

