
Model Maker Help Document

Revised January 21, 2004

Model Maker allows you to view the evidence that was used to build a gene model on assembled genomic sequence, and to create your own version of the model by selecting exons of interest. To see a Model Maker example, follow the "mm" link beside any gene annotated on the human "Gene Sequence" map.

  1. Overview
  2. How to Access Model Maker
  3. What You See
  4. What You Can Do

Overview

Model Maker allows you to construct an mRNA sequence from genomic data. It displays the evidence (mRNAs, ESTs, and gene predictions) that was aligned to assembled genomic sequence in order to build a gene model, and allows you to edit the model by selecting or removing putative exons. You can then view the mRNA sequence and potential ORFs for the edited model, and save the mRNA sequence data for use in other programs.

How to Access Model Maker

Model Maker is accessible from sequence maps that were analyzed at NCBI and displayed in Map Viewer. Only some of the organisms represented in Map Viewer have such sequence maps, and therefore Model Maker links. The Map Viewer help document includes links to organism-specific data and search tips files that describe the maps available for each organism.

When Model Maker is available for an organism, it can be accessed through several links, described below. All of the links appear when a sequence map analyzed at NCBI is displayed as the master map.

  • an mm link (for Model Maker) appears in the "Links" column near the right edge of the display for each map element

  • a seq link also appears in the "Links" column near the right edge of the display for each map element. The seq link leads to a page that allows you to download or view data for the genomic region that contains that map element. That page also includes a model maker link for the genomic region.

  • a Download/View Sequence/Evidence link appears in the header area of the page, above the graphic display of a chromosome/region. That link leads to a page that allows you to download or view data for every contig in the chromosome region being displayed. A model maker link is provided for each contig on that page.

To see examples of these links, display the human "Gene_Sequence" map, which is described in the Homo sapiens data and search tips file.

What You See

Four Parts of Display

The Model Maker display has four parts:

  1. Evidence - graphic depiction of the expressed sequences and gene models that align to the selected genomic region

  2. Putative Exons (Graphic View) - graphic depiction of all putative exons, based on all the alignments (to the current strand) shown in part 1 of the display

  3. Your Model - graphic view of the putative exons you have selected for inclusion in your model, and the corresponding mRNA sequence and three-frame translations

  4. Putative Exons (Table View) - a tabular version of part 2 of the display

The actions you can take in each part are described below.

Legend

What You Can Do

Evidence

The Evidence section graphically displays the expressed sequences and gene models that align to both strands of the selected genomic region.

The expressed sequences include all human mRNAs from RefSeq and GenBank, and all ESTs (if you choose to display them), that were available to the genome assembly and annotation pipeline at the time of the data freeze. Gene models include those built by Gnomon, and the final models retained by NCBI based on this evidence.

Evidence on the opposite strand is shown above the black line that represents the genomic contig (NT_*) being displayed, and evidence on the current strand is shown below the black line. The putative exons parts of the display (graphic and table view) are based only on the evidence aligned to the current strand.

The actions you can take in the evidence section are described below.

View Contig Sequence Record

The contig accession (NT_*) links to the complete sequence record in Entrez.

The mv, sv, ev, and seq links beside the NT_* accession lead to other views of the contig:

  • mv (map viewer): Graphical display of the current contig region on several sequence maps that have been aligned to each other and annotated with different information. Controls in the sidebar of the display allow you to zoom in or out to a smaller or larger region. Clicking on the graphic for a map opens a dialog box that allows you to zoom in further. Additional maps can be selected for display from the "Maps and Options" dialog box.

  • sv (sequence viewer): Graphical display of the contig, including the position of the map element within the sequence region, and biological features such as coding region (CDS), RNA, and other features that have been annotated on that region. A 2 Kb section of sequence is shown below that, with corresponding graphic annotations of the features. The left and right arrows at either end of the sequence data allow you to move upstream and downstream.

  • ev (evidence viewer): Graphical display of the biological evidence supporting a particular gene model. It displays all RefSeq models, GenBank mRNAs, annotated known or potential transcripts, and ESTs that align to the genomic sequence region of interest.

  • seq (sequence download): Opens a form that allows you to download a region of a chromosome. The form has two parts: (1) the top part allows you to enter chromosome coordinates in text boxes, and (2) the bottom part displays the NT_* contigs (or portions of them) that are found in that chromosome region.

Note that part 1 shows the position (base span) of the region on the chromosome, and part 2 shows the position of the region on the contig. The "strand" column for each contig shows whether that contig is on the plus or minus strand of the chromosome. Therefore, if a contig is on the minus strand, increasing the value of the 3' chromosome coordinate will decrease the value of the 5' contig coordinate.
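The relationship between chromosome and contig coordinates can be sketched in a few lines of Python. This is only an illustration: the function name, the 1-based inclusive coordinate convention, and the placement values (a contig covering chromosome positions 501 to 3500) are assumptions made for the example, not values taken from Map Viewer.

```python
def chrom_to_contig(chrom_pos, placement_start, placement_stop, strand):
    """Convert a 1-based chromosome position to a 1-based contig position.

    placement_start and placement_stop are the (hypothetical) chromosome
    coordinates covered by the contig; strand is "+" or "-".
    """
    if not placement_start <= chrom_pos <= placement_stop:
        raise ValueError("position lies outside the contig placement")
    if strand == "+":
        # Plus strand: contig coordinates run in the same direction.
        return chrom_pos - placement_start + 1
    if strand == "-":
        # Minus strand: contig coordinates run in the opposite direction.
        return placement_stop - chrom_pos + 1
    raise ValueError("strand must be '+' or '-'")


# On a minus-strand contig, a larger chromosome coordinate
# maps to a smaller contig coordinate.
before = chrom_to_contig(2000, 501, 3500, "-")
after = chrom_to_contig(2100, 501, 3500, "-")
print(before, after)
```

On the plus strand the two coordinate systems move together, so the same change to the chromosome coordinate would increase the contig coordinate instead.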

The options to "Display, Save to Disk, and View Evidence" allow you to view the individual contigs in the region (or portions of them, depending on the chromosome region specified).

By default, the seq link beside each gene displays the chromosome and contig coordinates for the span of that gene. To view/save additional sequence data upstream and downstream of the gene, simply adjust the chromosome coordinates and press the "Change Region" button. Note that the contig coordinates will also change.



View Evidence Sequence Records (mRNAs and ESTs)

Click on the graphic for a particular mRNA or EST to open its sequence record in Entrez.

Set:    select all exons from an mRNA

Follow the "set" link beside an mRNA, EST, or model in the evidence section to place all of its exons in your model. Choosing the set from a different mRNA, EST, or model completely replaces any earlier selection.

Hits:   view alignment positions of mRNA to contig

Follow the "hits" link for an mRNA, EST, or model in the evidence section to see the base span alignments between that item and the genomic contig.

Add ESTs

The "Add ESTs" option displays the ESTs that align to both strands of the genomic region shown. ESTs on the opposite strand are shown above the black line that represents the genomic contig (NT_*) being displayed, and ESTs on the current strand are shown below the black line.

View Opposite Strand

The change strand option flips the display, so any evidence that was shown below the black line representing the genomic contig (NT_*) is now shown above it, and vice versa.

If all the evidence now appears above the black line, and none below it, the putative exons parts of the display (graphic and table view) will be blank. This is because the putative exons are based only on the evidence aligned to the current strand (i.e., below the black NT_* line).

Extend Region Shown    <<< downstream  and  upstream >>>

Pressing the  <<< and  >>>  arrows increases the amount of data (base span) shown by 50% in the direction chosen. The arrows are found at either end of the black line that represents the NT_* contig.
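As a rough illustration of the arithmetic, the sketch below grows a displayed region by half of its current span at one end. The function name, the "left"/"right" labels for the low-coordinate and high-coordinate ends, the rounding, and the clamping to the contig boundaries are all assumptions; the document does not describe how Model Maker handles those details.

```python
def extend_region(start, stop, side, contig_length=None):
    """Extend a 1-based inclusive region by 50% of its span on one side.

    side is "left" (lower coordinates) or "right" (higher coordinates).
    """
    span = stop - start + 1
    extra = span // 2  # 50% of the current span, rounded down (assumed)
    if side == "left":
        start = max(1, start - extra)
    elif side == "right":
        stop += extra
        if contig_length is not None:
            stop = min(stop, contig_length)
    else:
        raise ValueError("side must be 'left' or 'right'")
    return start, stop


# A 1,000-base view extended toward higher coordinates becomes 1,500 bases.
print(extend_region(1001, 2000, "right"))
```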

Putative Exons (Graphic View)

This section graphically depicts all putative exons based on all alignments to the current strand of the contig, shown in part 1 of the display. Therefore, adding or removing ESTs from the display can affect the number of putative exons shown.

Evidence that aligns to the opposite strand (i.e., above the black line representing the NT_* contig) will not be reflected in the putative exons.

Combination of exons from different evidence into a single putative exon:
Exons that have the same start and stop positions make one putative exon. In addition, the first and last exons in any mRNA, EST, or model have only one splice position; such exons from various alignments are also collapsed if their splice position is shared. Single-exon mRNAs, ESTs, and models have no splice positions; such exons are also collapsed if they intersect. A table view of the detailed information for each exon is provided at the end of the display.
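The three collapsing rules can be sketched as follows. This is an illustration of the rules as stated, not Model Maker's actual code: the function name and data layout are hypothetical, each alignment is assumed to be a list of (start, stop) exon pairs in transcript order, the coordinates are made up, and first, last, and internal exons are assumed never to collapse with one another.

```python
from collections import defaultdict


def collapse_exons(alignments):
    """Group aligned exons into putative exons.

    Returns a list of groups; each group is a list of the distinct
    (start, stop) exons that collapse into one putative exon.
    """
    groups = defaultdict(set)
    singles = []
    for exons in alignments:
        if len(exons) == 1:
            singles.append(exons[0])  # no splice positions at all
            continue
        for i, (start, stop) in enumerate(exons):
            if i == 0:
                key = ("first", stop)  # only the 3' end is a splice position
            elif i == len(exons) - 1:
                key = ("last", start)  # only the 5' end is a splice position
            else:
                key = ("internal", start, stop)  # both ends must match
            groups[key].add((start, stop))
    putative = [sorted(members) for members in groups.values()]

    # Single-exon alignments collapse when their intervals intersect.
    cluster, cluster_hi = [], None
    for exon in sorted(set(singles), key=min):
        lo, hi = min(exon), max(exon)
        if cluster and lo <= cluster_hi:
            cluster.append(exon)
            cluster_hi = max(cluster_hi, hi)
        else:
            if cluster:
                putative.append(cluster)
            cluster, cluster_hi = [exon], hi
    if cluster:
        putative.append(cluster)
    return putative


alignments = [
    [(100, 200), (300, 400), (500, 650)],  # three-exon mRNA
    [(150, 200), (300, 400), (500, 600)],  # EST sharing both splice junctions
    [(900, 1000)],                         # single-exon EST
    [(950, 1100)],                         # overlapping single-exon EST
]
for group in collapse_exons(alignments):
    print(group)
```

Here eight aligned exons reduce to four putative exons: the two first exons share a 3' splice position, the internal exons are identical, the two last exons share a 5' splice position, and the single-exon alignments overlap.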

The actions you can take in this graphic view section are described below.

Add or Remove Exons

Click on the dark green putative exons in the graphic display to add or remove those exons from your model.

Your Model

This part of the page is blank until you select the putative exons that you would like in your model. Exons can be selected in several ways, described below. Once you have selected exons, you will see a graphical view of the putative exons you have selected, the corresponding mRNA sequence, and the three-frame translations.

Additional information about each item in this section of the display is below.

Graphic View of Exons

This part of the page is blank until you select putative exons in any of several ways:

  • select a set of exons from an individual alignment in the Evidence section. (If you have already chosen exons in another way, selecting a set replaces any previous exons you chose.)
  • select or remove individual exons by clicking on putative exons in the graphic view or by using the check boxes in the table view.
  • clear the section to begin again.

Your mRNA Sequence

Once you have selected exons in any of the ways described above, Model Maker will show the merged sequence data (i.e., mRNA) from the selected exons in a text box.
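Conceptually, merging the selected exons amounts to joining their genomic subsequences in order. The sketch below assumes 1-based inclusive contig coordinates and treats an exon whose start is greater than its stop as lying on the minus strand of the contig, so its sequence is reverse complemented. That convention, the function names, and the toy contig are assumptions for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def exon_sequence(contig, start, stop):
    """Return the sequence of one exon from a contig string."""
    if start <= stop:
        return contig[start - 1:stop]
    # start > stop: take the span and reverse complement it (assumed convention)
    return contig[stop - 1:start].translate(_COMPLEMENT)[::-1]


def build_mrna(contig, exons):
    """Join the exon sequences in the order given (5' to 3')."""
    return "".join(exon_sequence(contig, start, stop) for start, stop in exons)


contig = "ATGCCGTTA"  # toy sequence, not real data
print(build_mrna(contig, [(1, 3), (7, 9)]))
print(build_mrna(contig, [(9, 7), (3, 1)]))
```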

You can then Save your mRNA sequence to a file in FASTA format for use with other programs such as BLAST, primer design programs, etc.
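For readers who want to produce a comparable file outside Model Maker, a FASTA record is a ">" definition line followed by the sequence on one or more lines. The helper below is a minimal sketch; the function name, the identifier, and the 70-character line width are assumptions, and the exact layout of the file that Model Maker saves is not specified here.

```python
def to_fasta(identifier, sequence, width=70):
    """Format a sequence as a single FASTA record."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


record = to_fasta("my_model", "ATGGCGGTGCGACAGGCGCTG")  # made-up sequence
with open("my_model.fasta", "w") as handle:
    handle.write(record)
```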

The ORF Finder option leads to a graphic view of the ORFs from 6-frame translations. Clicking on the graphic for any ORF will display the corresponding nucleotide sequence data and amino acid translation. The deduced amino acid sequence can be saved in various formats and searched against the sequence database using the WWW BLAST server.  (more about ORF Finder)

3-Frame Translations

The Frame1 ORF, Frame2 ORF, and Frame3 ORF text boxes show the translations of your mRNA in 3 reading frames, and the number of amino acids in the longest ORF from each translation. The longest ORF in each translation is shown in upper case letters. The number of amino acids in that ORF is calculated from stop to stop codon, not from the methionine start codon to the stop codon. The count does not include the stop codons.
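The stop-to-stop counting described above can be sketched as follows. This is an approximation for illustration: it uses the standard genetic code, the function names are hypothetical, unknown codons are shown as "x", and ties between equally long ORFs go to the first one, which may differ from Model Maker's behavior.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, frame):
    """Translate one forward frame (0, 1, or 2); '*' marks a stop codon."""
    seq = mrna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frame_display(mrna, frame):
    """Return (translation, length) with the longest ORF in upper case.

    The ORF is measured from stop to stop, excluding the stop codons.
    """
    protein = translate(mrna, frame).lower()
    best_start, best_len, pos = 0, 0, 0
    for segment in protein.split("*"):
        if len(segment) > best_len:
            best_start, best_len = pos, len(segment)
        pos += len(segment) + 1  # skip the segment and the stop after it
    end = best_start + best_len
    shown = protein[:best_start] + protein[best_start:end].upper() + protein[end:]
    return shown, best_len


mrna = "ATGAAATAGGGGCCCTTTAAATGA"  # made-up sequence
for frame in range(3):
    print(frame + 1, frame_display(mrna, frame))
```

In frame 1 of this toy sequence the longest ORF is the four residues between the two stop codons, even though a methionine precedes the first stop, which is the behavior the stop-to-stop rule implies.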

Putative Exons (Table View)

This section presents a table view of all putative exons shown in part 2 of the display (putative exons graphic view).  The table view also shows detailed information for each exon, described below, and allows you to add or remove exons from your model.

Interpretation of Detailed Information for Each Exon

The detailed information for each exon includes:

  • start and stop positions of that exon on the genomic contig (NT_* record)
  • first three and last three bases of each exon
  • two bases immediately upstream and downstream of the exon
  • the exon numbers, both upstream and downstream, to which this exon has been spliced in any evidence shown in part 1 of the display

If a putative exon represents two or more exons of different lengths (as explained above in the putative exons graphic view), the table view will show the information for both the longest and shortest exons.  Examples are below.

Examples

Example 1:  Putative exon with single known start and stop position. Example below is from the Model Maker view of the PINK1 gene on human chromosome 1:

               4  1 or 3 <= AG|GCA 574116-573829 TCG|GT => 5
This is putative exon number 4, based on the evidence being displayed in Model Maker. The start position of the exon on the genomic contig is 574116. The first three nucleotides in the exon are GCA, with AG immediately upstream.  The stop position is 573829. The last three nucleotides are TCG, with GT immediately downstream. This exon has been spliced to exons 1 or 3 upstream, and to exon 5 downstream, based on the evidence shown in part 1 of the display.
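If you copy such a row out of the table and want to pick it apart programmatically, a regular expression along the following lines could work. It is a sketch based only on the single row shown in Example 1: the field names are invented, it does not handle rows that list both a shortest and a longest exon, and the assumption that longer upstream or downstream lists are also joined with "or" is a guess.

```python
import re

ROW = re.compile(
    r"^\s*(?P<number>\d+)\s+"
    r"(?:(?P<upstream>\d+(?:\s+or\s+\d+)*)\s*<=\s*)?"
    r"(?P<up_flank>[ACGT]{2})\|(?P<first3>[ACGT]{3})\s+"
    r"(?P<start>\d+)-(?P<stop>\d+)\s+"
    r"(?P<last3>[ACGT]{3})\|(?P<down_flank>[ACGT]{2})"
    r"(?:\s*=>\s*(?P<downstream>\d+(?:\s+or\s+\d+)*))?\s*$"
)


def _exon_numbers(text):
    """Turn '1 or 3' into [1, 3]; a missing field becomes an empty list."""
    return [int(n) for n in re.findall(r"\d+", text)] if text else []


def parse_exon_row(line):
    """Parse a simple (single start and stop) putative exon row."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognized exon row: {line!r}")
    f = match.groupdict()
    start, stop = int(f["start"]), int(f["stop"])
    return {
        "number": int(f["number"]),
        "upstream_exons": _exon_numbers(f["upstream"]),
        "up_flank": f["up_flank"],
        "first3": f["first3"],
        "start": start,
        "stop": stop,
        "length": abs(start - stop) + 1,
        "last3": f["last3"],
        "down_flank": f["down_flank"],
        "downstream_exons": _exon_numbers(f["downstream"]),
    }


row = parse_exon_row("4  1 or 3 <= AG|GCA 574116-573829 TCG|GT => 5")
print(row)
```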

Example 2:   Putative exon that represents two or more exons of different lengths (as explained above):

               123 GG|AAA...TC|CAT 456-788 TTG|AG...TTT|CC 986
This example is most easily read from the center. The shortest exon found in the evidence aligns to bases 456-788 of the genomic contig. The first and last three bases of that exon are CAT and TTG, respectively. The two bases immediately upstream and downstream are TC and AG, respectively. The longest exon found in the evidence aligns to bases 123-986 of the genomic contig. The first and last three bases of that exon are AAA and TTT, with GG and CC immediately upstream and downstream, respectively.

In some cases, you might see the nucleotides written in a format such as:   GG|A|CTG.  That indicates a single nucleotide difference (the A) between the shortest and longest exons. A format such as GG|AA|CTG indicates a difference of two nucleotides, and so on. There can be up to eight letters between the vertical bars. If exon lengths differ by more than eight nucleotides, an ellipsis (...) is used to show the intervening nucleotides, as in the example above.
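One way to read the notation is to reconstruct it from the two start positions. The function below is an inferred sketch for the 5' boundary only, on a plus-strand contig with 1-based coordinates; the function name, the handling of the no-difference case, and the toy sequences are assumptions, and Model Maker's own formatting rules may differ in details.

```python
def format_start_boundary(contig, longest_start, shortest_start):
    """Render the 5' boundary shared by a longest and a shortest exon.

    Assumes longest_start <= shortest_start and at least two bases of
    contig sequence upstream of longest_start.
    """
    flank_long = contig[longest_start - 3:longest_start - 1]
    first3_short = contig[shortest_start - 1:shortest_start + 2]
    difference = shortest_start - longest_start
    if difference == 0:
        return f"{flank_long}|{first3_short}"
    if difference <= 8:
        # Up to eight differing nucleotides are written out between the bars.
        extra = contig[longest_start - 1:shortest_start - 1]
        return f"{flank_long}|{extra}|{first3_short}"
    # Larger differences are abbreviated with an ellipsis.
    first3_long = contig[longest_start - 1:longest_start + 2]
    flank_short = contig[shortest_start - 3:shortest_start - 1]
    return f"{flank_long}|{first3_long}...{flank_short}|{first3_short}"


print(format_start_boundary("GGACTGAAAA", 3, 4))
print(format_start_boundary("GGAAACCCCCCCTCCATGG", 3, 15))
```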

Example 3:   First and last putative exons derived from exons of different lengths that share a single splice junction (as explained above). Example below is from the Model Maker view of the PINK1 gene:

               1  578503 TG|CGC...TG|TTG 578479-578023 CAG|GT => 4
              11            10 <= AG|AGA 561524-560638 TGC|AG...CTG|AG 560441
Putative exons 1 and 11 are the first and last exons of the PINK1 gene, based on the evidence displayed in Model Maker. The shortest and longest first exons found in the evidence align to bases 578479-578023 and 578503-578023 of the genomic contig, respectively. Both share the same 3' splice junction. The shortest and longest last exons align to bases 561524-560638 and 561524-560441, respectively, sharing the same 5' splice junction. The other information can be interpreted as explained in the examples above.

Note:   The exons displayed by Model Maker for the PINK1 gene shown in the example above might change over time as the genome sequence data and the build and annotation procedures continue to evolve.

Add or Remove Exons

The check boxes beside each putative exon allow you to include or remove that exon from your model. The exon numbers that are shown upstream and downstream of the detailed information for any exon also act as toggle switches to include or remove those exons from your model.

Questions or Comments?
Write to the NCBI Service Desk