CGAP       the Cancer Genome Anatomy Project

All About the SAGE Digital Gene Expression Display Tool

Overview

The SAGE Digital Gene Expression Displayer (DGED) analyzes the differences in SAGE tag expression between two pools of libraries. It evaluates the statistical significance of those differences using the sequence odds ratio and a Bayesian test.

What to Put in the Query Fields

Query Field: Options
1. List libraries by: Select any option (tissue type, histology, or library name) to organize the list of libraries for review on the set-up page.
2. Pools A and B:
i) Tissue Type: There are 4 options:
  • The default setting is all tissues (nothing is highlighted) with the "Include" button selected. This searches all tissue types.
  • Keep the default setting of all tissues, check "Exclude", and highlight one or more** tissues. This will search all the tissues in the list excluding the highlighted item(s).
  • Select a single tissue and keep the "Include" button checked.
  • Select two or more** tissues and keep the "Include" button checked.
ii) Exclude Cell Line: The default setting is to exclude all libraries constructed from cell lines.
iii) Tissue Histology: The default setting (nothing selected) includes both normal and cancer. Either one may be selected separately.
iv) Library Name: Enter a full or partial library name, e.g. SAGE_Brain_astrocytoma_gradeIIIB_H1020 or astrocytoma.

** To choose multiple items in a select box, hold down the following keys together as you click on each item:
  • On a PC: [CTRL] and [Alt]
  • On a Mac: [Alt] and [Apple]
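The four Tissue Type options reduce to two modes ("Include" and "Exclude") applied to a possibly empty set of highlighted tissues. The following is a minimal sketch of that selection logic, not the tool's own code; the function name `select_tissues`, its parameters, and the tissue names are hypothetical and chosen only for illustration.

```python
def select_tissues(all_tissues, highlighted=(), mode="include"):
    """Return the tissues that would be searched (illustrative only)."""
    highlighted = set(highlighted)
    unknown = highlighted - set(all_tissues)
    if unknown:
        raise ValueError(f"unknown tissue(s): {sorted(unknown)}")
    if mode == "include":
        # Nothing highlighted means "all tissues";
        # otherwise only the highlighted tissues are searched.
        return [t for t in all_tissues if not highlighted or t in highlighted]
    if mode == "exclude":
        # Search every tissue except the highlighted ones.
        return [t for t in all_tissues if t not in highlighted]
    raise ValueError("mode must be 'include' or 'exclude'")


# Hypothetical tissue list, for illustration only.
tissues = ["brain", "colon", "prostate"]
print(select_tissues(tissues))                         # default: all tissues
print(select_tissues(tissues, ["prostate"], "exclude"))  # all but prostate
print(select_tissues(tissues, ["colon"]))              # a single tissue
```

The "Exclude" mode is what a pool such as "any normal tissue, excluding prostate" relies on.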

Examples of SAGE DGED Queries

Below are some examples of queries built with the SAGE DGED.

Pool A                    Pool B
Cancer colon tissue       Normal colon tissue
Cancer prostate tissue    Any normal tissue, excluding prostate
Any cancer tissue         Any normal tissue

Reviewing Selected Libraries in Pools A and B

After you select the criteria for Pools A and B and press Submit Query, the "Review of Library Pools for SAGE DGED" page appears. It contains:

  • The option to specify the expression factor (F). The expression factor, in conjunction with the significance filter (P), determines which results are reported. A result is reported if the odds ratio is significantly greater than F or significantly less than 1/F. F is set by default to "2" but this number may be set to any number greater than or equal to 1. As F increases, fewer results will be reported.
  • The option to specify the significance filter (P). The significance filter, in conjunction with the expression factor (F), determines which results are reported. A result is reported if its significance is less than P. P is set by default to ".01" but this number may be set to any number from 0 to 1. As P increases, more results will be reported.
  • Libraries chosen for each pool listed in a table.

    • The first column is the Target Pool (A) and the second column is Background (B). The checked boxes indicate which library belongs to which pool.
    • The next column provides the number of sequences in the library.
    • The last column lists keywords which describe the library.

    It may be necessary to re-sort the libraries, e.g., to list all Target libraries followed by all Background libraries. To do this, click the back button on your browser, choose the appropriate option under "List libraries by" (query field 1), and press Submit Query again.
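How F and P interact can be illustrated with a small filter. This is a simplified sketch, not the tool's implementation: the tool tests whether the odds ratio is significantly greater than F (or less than 1/F) with a Bayesian test, whereas the hypothetical function `is_reported` below takes an already computed odds ratio and P value and applies the two thresholds independently.

```python
def is_reported(odds_ratio, p_value, f=2.0, p=0.01):
    """Simplified stand-in for the reporting rule (assumption, see lead-in)."""
    if f < 1:
        raise ValueError("F must be greater than or equal to 1")
    if not 0 <= p <= 1:
        raise ValueError("P must be between 0 and 1")
    # The tag must differ by at least a factor of F in either direction...
    differs_enough = odds_ratio > f or odds_ratio < 1 / f
    # ...and the result must pass the significance filter.
    return differs_enough and p_value < p


print(is_reported(3.0, 0.001))         # passes the defaults F=2, P=.01
print(is_reported(3.0, 0.001, f=4))    # raising F removes this result
print(is_reported(3.0, 0.05, p=0.1))   # raising P admits this result
```

The last two calls mirror the behavior described above: a larger F reports fewer results, and a larger P reports more.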

Carefully review the libraries before proceeding. Check that there are no libraries of "pooled" tissues or whole fetal tissue, which may invalidate the results. Check that no library appears in both pools. If you want to narrow your original selection, remove libraries from it. When you are satisfied, press Submit Query.
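If the library list has been copied out of the review page, these checks can also be scripted. The sketch below assumes each pool is a dictionary mapping a library name to its keyword string; the function name `review_pools`, the library names, and the keyword wording are hypothetical. The search terms are deliberately broad, so anything flagged still needs a manual look.

```python
def review_pools(pool_a, pool_b, flagged_terms=("pooled", "fetal")):
    """Return (libraries present in both pools, libraries with flagged keywords)."""
    in_both = sorted(set(pool_a) & set(pool_b))
    all_libraries = {**pool_a, **pool_b}
    suspicious = sorted(
        name
        for name, keywords in all_libraries.items()
        if any(term in keywords.lower() for term in flagged_terms)
    )
    return in_both, suspicious


# Hypothetical library names and keywords, for illustration only.
pool_a = {"LIB_A1": "colon, cancer", "LIB_X": "colon, pooled"}
pool_b = {"LIB_B1": "colon, normal", "LIB_X": "colon, pooled"}
in_both, suspicious = review_pools(pool_a, pool_b)
print("In both pools:", in_both)
print("Check by hand:", suspicious)
```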

Understanding the SAGE DGED Results Page

The results page contains:

  • The UniGene Build number
  • The total number of sequences or tags in each pool
  • The total number of libraries in each pool
  • A table listing the genes or tags found to be expressed with a statistically significant difference between Pools A and B. The following information is provided:

    Field: Description
    Tag: The tag is hyperlinked to complete mapping information for the tag.
    Gene: The gene symbol (or the UniGene cluster number, in the case of an anonymous gene) that has been mapped to the tag, using the best gene for the tag. If multiple genes are the best choices for the tag, then the gene symbol (or cluster number) is followed by "...". In some cases, a tag has been associated with an accession number for a transcript or EST but not with a UniGene cluster; in other cases, the tag has been associated with neither a UniGene cluster nor an accession.
    Libs A (or B): The number of libraries which contain this tag.
    Tags A (or B): Tag frequency in either Pool A or B.
    Tag Odds (A:B): The odds ratio uses a simple mathematical formula to provide a measure of the relative amount of a tag in Pool A to Pool B.
    P value: A test of probability: the smaller the number, the more likely the result is not due merely to sampling error.

Used together, the sequence odds ratio and the significance test provide a measure of confidence that the difference in the expression of a gene or tag is "real" and not due to sampling error.
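This page does not spell out the odds ratio formula. One common reading, assumed here, is the ratio of the tag's relative frequency in Pool A to its relative frequency in Pool B, where each frequency is the tag count divided by the total number of tags in that pool. The function name `tag_odds_ratio` and the handling of zero counts are illustrative choices, not documented tool behavior.

```python
def tag_odds_ratio(tags_a, total_a, tags_b, total_b):
    """Relative frequency in Pool A divided by relative frequency in Pool B.

    Assumed formula; the tool's exact calculation is not given on this page.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("pool totals must be positive")
    freq_a = tags_a / total_a
    freq_b = tags_b / total_b
    if freq_b == 0:
        # Illustrative convention only: infinite if the tag occurs in A alone,
        # undefined (NaN) if it occurs in neither pool.
        return float("inf") if freq_a > 0 else float("nan")
    return freq_a / freq_b


# 30 tags among 100,000 in Pool A versus 10 tags among 200,000 in Pool B.
print(round(tag_odds_ratio(30, 100_000, 10, 200_000), 3))
```

Normalizing by the pool totals matters because the two pools rarely contain the same number of tags; comparing raw tag counts would favor the larger pool.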


If you have any questions, comments, or need information about CGAP, please contact the NCI CGAP Help Desk.