CGAP       the Cancer Genome Anatomy Project

Extract SAGE Tags From Sequence Files

What the SAGE Tag Extraction Tool Can Do

The tag extraction tool allows you to extract 10-bp or 17-bp SAGE tags from sequence files that you upload on this page. You may request that linker-similar tags be removed from the results; for this option you may use your own list of linker-similar tags or use default lists. The tag extraction tool will return to you the list of extracted tags as well as a report on the process. The extraction tool also allows you to extract 10-bp tags from a list of 17-bp tags by taking the first 10 base pairs of each 17-bp tag and then collating results.

1. Extract Tags From Sequence Files

1. Prepare a compressed file containing all of your sequences in FASTA format; each sequence file must have the extension '.seq' before compression. The only accepted compression formats are (1) zip files produced by WinZip on Windows, and (2) .zip or .gz files produced on Unix/Linux systems. You do not need a separate file for each FASTA sequence: a single '.seq' file may contain multiple FASTA sequences, and you may submit multiple '.seq' files that each contain multiple FASTA sequences. If you are submitting multiple '.seq' files from a Unix/Linux machine, first use tar to create a single file, which can then be compressed (xxx.tar.zip or xxx.tar.gz). Only WinZip zip files from Windows and tar.zip or tar.gz files from Unix/Linux are processed.
2. Enter the name of the compressed file containing your sequence file(s) or use the "Browse" button to locate the file in a local directory.

3. Choose one of the following options: specify your own linker-similar sequences, use the default linker-similar sequences, or do not exclude any linker-similar sequences. The default linker-similar lists contain every tag that is a one-bp substitution, insertion, or deletion variant of TCCCTATTAA or TCCCCGTACA (for short SAGE), or of TCGGACGTACATCGTTA or TCGGATATTAAGCCTAG (for long SAGE).

Use default: default short linker-similar list
4.


5.


6.


7.


8. Click the "Extract Tags" button.
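
To make the definition of a linker-similar tag in step 3 concrete, the following is a minimal Python sketch that enumerates every one-bp substitution, insertion, and deletion variant of a linker sequence and drops matching tags from a tag-count table. The function names (`one_bp_variants`, `remove_linker_similar`) and the sample counts are hypothetical and are not part of the CGAP tool. Insertion and deletion variants differ in length from the linker; how the tool trims or pads them to tag length is not stated on this page, so the sketch leaves them as they are. Treating the exact linker sequence as excluded is also an assumption.

```python
from typing import Dict, Iterable, Set

BASES = "ACGT"

# Default short SAGE linker sequences quoted in step 3 above.
SHORT_LINKERS = ("TCCCTATTAA", "TCCCCGTACA")


def one_bp_variants(seq: str) -> Set[str]:
    """Return every sequence one substitution, insertion, or deletion away from seq."""
    variants: Set[str] = set()
    for i in range(len(seq)):
        # Substitutions: replace position i with each of the other three bases.
        for base in BASES:
            if base != seq[i]:
                variants.add(seq[:i] + base + seq[i + 1:])
        # Deletion: drop position i (result is one base shorter).
        variants.add(seq[:i] + seq[i + 1:])
    # Insertions: add one base at every possible position (result is one base longer).
    for i in range(len(seq) + 1):
        for base in BASES:
            variants.add(seq[:i] + base + seq[i:])
    return variants


def remove_linker_similar(tag_counts: Dict[str, int],
                          linkers: Iterable[str]) -> Dict[str, int]:
    """Drop tags that equal a linker or one of its one-bp variants (assumed rule)."""
    excluded: Set[str] = set()
    for linker in linkers:
        excluded.add(linker)  # assumption: the exact linker is excluded as well
        excluded |= one_bp_variants(linker)
    return {tag: n for tag, n in tag_counts.items() if tag not in excluded}


# Made-up counts, for illustration only.
counts = {"TCCCTATTAA": 5, "TCCCTATTAT": 2, "GATTACAGAT": 7}
filtered = remove_linker_similar(counts, SHORT_LINKERS)
```

In this sketch only the exact linkers and their substitution variants can match a 10-bp tag by plain string comparison, because the insertion and deletion variants are 11 and 9 bases long.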



2. Extract Short Tags From Long Tags

1. Prepare a file containing long tags with their frequencies. Each line in the file must have one tag and its numeric frequency, separated by a TAB. Do not compress the file.
2.

3. Click the "Extract Tags" button.
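
The conversion of long tags to short tags can be sketched in a few lines of Python. The sketch below reads lines of the form tag, TAB, frequency, keeps the first 10 bases of each 17-bp tag, and collates the results. It assumes that "collating" means summing the frequencies of long tags that share the same 10-bp prefix; the page does not spell this out. The function name `collapse_long_tags` and the sample lines are hypothetical.

```python
from collections import Counter
from typing import Iterable


def collapse_long_tags(lines: Iterable[str], short_len: int = 10) -> Counter:
    """Truncate each long tag to short_len bases and sum frequencies per short tag."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tag, freq = line.split("\t")
        counts[tag[:short_len]] += int(freq)
    return counts


# Made-up input in the tag<TAB>frequency layout described in step 1.
sample = [
    "AAAAAAAAAACCCCCCC\t3",
    "AAAAAAAAAAGGGGGGG\t2",
    "ACGTACGTACGTACGTA\t4",
]
short_counts = collapse_long_tags(sample)
```

To run the same function on a file, pass an open file handle, for example `collapse_long_tags(open("long_tags.txt"))`, where `long_tags.txt` is a placeholder name for your uncompressed tag file.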


If you have comments or questions on this website, contact NCICB Application Support at ncicb@pop.nci.nih.gov.