HIV sequence database



Gene Cutter: Sequence Alignment and Protein Extraction

Purpose: Gene Cutter is a sequence alignment and protein extraction tool. It can be used for any set of nucleotide sequences for HIV-1, HIV-2 or SIV.

Gene Cutter can:

  • align your nucleotide sequences (if they aren't already aligned)
  • clip pre-defined coding regions from a nucleotide alignment
  • codon-align the coding regions
  • generate nucleotide and protein alignments of the cut regions

Details: Gene coordinates are based on the reference sequence used by this tool: HXB2 (accession K03455) for HIV-1, or SMM239 (accession M33262) for HIV-2 or SIV. This version of Gene Cutter does not require the reference sequence to be included in your input nucleotide alignment, and it also accepts unaligned sequence sets. Gene Cutter uses HMMER with a training set built from the full-length genome alignment, which will generally give a better multiple alignment than many computationally based alignment programs. Misalignments at the ends of a coding region may cause a few amino acids or bases to be missing from the output for that coding region.
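
The idea of clipping a region by reference coordinates can be illustrated with a small Python sketch. It assumes the alignment is held as a dict of equal-length strings and that the reference row is present in it; the function names, the toy sequences and the coordinates 4 to 6 are hypothetical and chosen only for illustration. This is not Gene Cutter's actual implementation.

```python
def ref_to_columns(aligned_ref, start, end):
    """Map 1-based, inclusive reference coordinates to an alignment slice.

    Gap characters ("-") in the aligned reference do not consume a
    coordinate. Returns (first_col, stop_col) for seq[first_col:stop_col].
    """
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    pos = 0          # number of ungapped reference bases seen so far
    first = None
    for col, ch in enumerate(aligned_ref):
        if ch == "-":
            continue
        pos += 1
        if pos == start:
            first = col
        if pos == end:
            return first, col + 1
    raise ValueError("region extends past the end of the reference")


def clip_region(alignment, ref_name, start, end):
    """Cut the same alignment columns out of every sequence."""
    first, stop = ref_to_columns(alignment[ref_name], start, end)
    return {name: seq[first:stop] for name, seq in alignment.items()}


# Toy alignment (hypothetical data): S1 has an insertion relative to REF,
# S2 has a deletion.
alignment = {
    "REF": "ATG-CCGTAA",
    "S1":  "ATGACCGTAA",
    "S2":  "ATG-C-GTAA",
}
print(clip_region(alignment, "REF", 4, 6))
# {'REF': 'CCG', 'S1': 'CCG', 'S2': 'C-G'}
```

Because coordinates are counted on the ungapped reference, an insertion in another sequence shifts the alignment columns but not the coordinates of the region.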

In some sequences, an insertion is compensated within a short distance by a deletion, or vice versa. Because such frameshifts may not inactivate the protein, the shifted reading frame is left intact when a compensating mutation lies within 5 amino acids of the initial frameshift. Otherwise, the frameshift is marked with a hash symbol (#), and translation continues in the correct reading frame beyond the offending codon. Stop codons are marked with a dollar sign ($).
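
The output symbols can be illustrated with a minimal Python sketch that translates a codon-aligned sequence one codon at a time, writing "-" for an all-gap codon, "#" for a partial codon and "$" for a stop codon. It assumes the standard genetic code and a sequence that is already in frame; the function names are hypothetical, writing "X" for any other unrecognized codon is an assumption of this sketch, and the rule for leaving a compensated frameshift intact within 5 amino acids is deliberately not implemented here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon):
    """Translate one aligned codon using the marker symbols described above."""
    codon = codon.upper()
    if codon == "---":
        return "-"            # whole codon is a gap
    if "-" in codon or len(codon) < 3:
        return "#"            # partial codon
    aa = CODON_TABLE.get(codon, "X")   # assumption: unknown codons become X
    return "$" if aa == "*" else aa    # stop codons shown as $


def translate_aligned(seq):
    """Translate a codon-aligned nucleotide string, three columns at a time."""
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, len(seq), 3))


print(translate_aligned("ATGTAA---AC-GGG"))  # M$-#G
```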

The best results will be obtained if you submit an alignment that has been hand-aligned and contains the correct reference sequence. For more information, see Gene Cutter Explanation.

Input
Select the organism
Paste your sequences, or upload your file
Check the box if your sequences are unaligned

Options
Region(s) to align and extract
Insert the reference sequence (HXB2, accession K03455, for HIV-1; SMM239, accession M33262, for HIV-2 or SIV) into the result set
Remove the reference sequence (HXB2, accession K03455, for HIV-1; SMM239, accession M33262, for HIV-2 or SIV) from the result set
Codon-align the region

Translation options
Codons containing an IUPAC character are shown as "X".
Codons containing an IUPAC character in a silent position are translated; others are shown as "X".
Codons containing an IUPAC character are translated.
Do not translate to amino acids
Note: Codons containing "-" are always translated to either "-" (gap) or "#" (partial codon).
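
The first two translation options can be sketched in Python as follows. The sketch assumes the standard genetic code and the standard IUPAC nucleotide ambiguity codes, and it treats a position as "silent" when every expansion of the ambiguous codon encodes the same amino acid; the function name and the mode labels "strict" and "silent" are hypothetical. The third option is omitted because this page does not say how an ambiguous, non-silent codon is displayed, and gap handling is left out.

```python
from itertools import product

# Standard genetic code in TCAG order, with stops written as "$".
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS.replace("*", "$"))
}

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous(codon, mode="silent"):
    """Translate one ungapped codon that may contain IUPAC ambiguity codes.

    mode="strict": any ambiguous codon is shown as "X".
    mode="silent": an ambiguous codon is translated only if all of its
                   expansions give the same amino acid; otherwise "X".
    """
    if mode not in ("strict", "silent"):
        raise ValueError("mode must be 'strict' or 'silent'")
    codon = codon.upper()
    if len(codon) != 3 or any(base not in IUPAC for base in codon):
        return "X"   # assumption: gaps and unknown symbols are not handled here
    if all(base in "ACGT" for base in codon):
        return CODON_TABLE[codon]
    if mode == "strict":
        return "X"
    expansions = product(*(IUPAC[base] for base in codon))
    amino_acids = {CODON_TABLE["".join(e)] for e in expansions}
    return amino_acids.pop() if len(amino_acids) == 1 else "X"


print(translate_ambiguous("CTN", "silent"))  # L
print(translate_ambiguous("CTN", "strict"))  # X
print(translate_ambiguous("ATR", "silent"))  # X
```

In this sketch CTN is translated in "silent" mode because CTT, CTC, CTA and CTG all encode leucine, while ATR is shown as "X" because ATA encodes isoleucine and ATG encodes methionine.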

Please be patient. Your input file must first be uploaded to our server, where the actual work is performed, and this can take several minutes. Do not resubmit your sequences: you will not get a result any faster, and the extra load on our server will make the process slower.

last modified: Mon Dec 10 10:58 2007


Questions or comments? Contact us at seq-info@lanl.gov.