HIV Databases
HIV sequence database



SynchAligns Explanation

This tool aligns two alignments to each other. You have the option to provide both alignments, or, if you provide only one alignment, that alignment will be synchronized to the reference alignment from our database. Synchronization occurs by using one reference sequence from each alignment, aligning those reference sequences to one another, and then adjusting gaps in the two alignments based on the alignment of the two reference sequences.
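The gap-adjustment step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code SynchAligns runs: the function names (split_by_reference, synchronize) are hypothetical, the pairwise alignment of the two ungapped reference sequences is passed in as pair1 and pair2 rather than computed, and when both alignments have an insertion at the same position the sketch simply places the insertions in separate columns.

```python
GAP = "-"

def split_by_reference(alignment, ref_index):
    """Group alignment columns by the reference residue they precede.

    residues[k] is the column holding reference residue k; inserts[k] is
    the list of columns (gaps in the reference) just before residue k,
    and the last entry of inserts holds any trailing gap columns.
    """
    ref = alignment[ref_index]
    inserts, residues, pending = [], [], []
    for col in range(len(ref)):
        column = [row[col] for row in alignment]
        if ref[col] == GAP:
            pending.append(column)
        else:
            inserts.append(pending)
            residues.append(column)
            pending = []
    inserts.append(pending)
    return inserts, residues

def synchronize(aln1, ref1, aln2, ref2, pair1, pair2):
    """Join two alignments, guided by the aligned references pair1/pair2."""
    ins1, res1 = split_by_reference(aln1, ref1)
    ins2, res2 = split_by_reference(aln2, ref2)
    gap1, gap2 = [GAP] * len(aln1), [GAP] * len(aln2)
    columns = []
    i = j = 0  # next reference residue to place from each alignment
    for a, b in zip(pair1, pair2):
        # Insertion columns that precede this position come first.
        if a != GAP:
            columns += [c + gap2 for c in ins1[i]]
        if b != GAP:
            columns += [gap1 + c for c in ins2[j]]
        top = res1[i] if a != GAP else gap1
        bottom = res2[j] if b != GAP else gap2
        columns.append(top + bottom)
        if a != GAP:
            i += 1
        if b != GAP:
            j += 1
    # Trailing insertion columns, if any.
    columns += [c + gap2 for c in ins1[i]]
    columns += [gap1 + c for c in ins2[j]]
    return ["".join(col[r] for col in columns) for r in range(len(columns[0]))]

# Input alignments from the example on this page; row 0 is the reference.
aln1 = ["JKLMN-OPQR-ST", "JKLMNYOPQRYST", "JKLMNYOPQR-ST"]
aln2 = ["HIJK-LMNOP", "HIJKXLMN-P", "--JK-LMNOP", "HIJK-LMNOP"]
for row in synchronize(aln1, 0, aln2, 0, "--JKLMNOPQRST", "HIJKLMNOP----"):
    print(row)
```

Each input column is carried over unchanged; only all-gap padding is added, so no residue within a submitted alignment is realigned.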

Reference sequences can be selected in one of two ways. If there is a sequence that is common to both alignments, the user may select this sequence manually. If the two alignments do not share a common sequence, the program automatically chooses the longest sequence from each alignment as the references, aligns them to one another and then adjusts the two submitted alignments to agree with the aligned references.
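As an illustration of the automatic choice, the sketch below picks the longest sequence from an alignment. It assumes that "longest" means the most non-gap characters and that ties go to the first sequence encountered; neither detail is stated here, and the function names are hypothetical.

```python
def ungapped_length(seq, gap_chars="-"):
    """Count the characters in seq that are not gap characters."""
    return sum(1 for ch in seq if ch not in gap_chars)

def pick_reference(alignment, gap_chars="-"):
    """Return the name of the longest sequence in a name -> sequence dict.

    max() keeps the first maximal item, so ties go to the sequence that
    appears first in the alignment (an assumption of this sketch).
    """
    return max(alignment,
               key=lambda name: ungapped_length(alignment[name], gap_chars))

alignment = {
    "seqA": "--JK-LMNOP",
    "seqB": "HIJKXLMN-P",
    "seqC": "HIJK-LMNOP",
}
print(pick_reference(alignment))
```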

Input format: The program should correctly read any valid alignment format. Input may be either nucleotide or protein alignments, but both alignments must be of the same type. Each sequence must have a name, so you cannot submit raw sequence files. The two files can be in different formats, but the output format will be the same as the input format of the first file. Also specify the gap character(s) if they are not a dash (-). More than one gap character may be specified in case your two alignments use different characters.
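If you prefer to prepare your files so that both alignments use the same gap character before submitting them, a small Python sketch such as the following would do it. The function name normalize_gaps is hypothetical, and the sketch assumes that none of the gap characters you list also occurs as a residue code in your sequences.

```python
def normalize_gaps(seq, gap_chars, canonical="-"):
    """Replace every character listed in gap_chars with one canonical gap."""
    table = str.maketrans({ch: canonical for ch in gap_chars})
    return seq.translate(table)

# One alignment uses "." and the other "~" as the gap character.
print(normalize_gaps("HIJK.LMNOP", ".~"))
print(normalize_gaps("~~JK~LMNOP", ".~"))
```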

Reference sequences: The program will synchronize your files using two reference sequences that it selects. If this fails and your alignments share a common reference sequence, you can check the "Reference sequence selection: Manual" button; then you will be asked to identify this sequence in both alignments. In this case, the program does not require that the reference sequences have the same names, but the sequences must be identical in the region where they overlap.

Example

Input:

ref1    JKLMN-OPQR-ST
align1  JKLMNYOPQRYST
        JKLMNYOPQR-ST

ref2    HIJK-LMNOP
align2  HIJKXLMN-P
        --JK-LMNOP
        HIJK-LMNOP

Result after SynchAligns:

ref1    --JK-LMN-OPQR-ST
align1  --JK-LMNYOPQRYST
        --JK-LMNYOPQR-ST
ref2    HIJK-LMN-OP-----
align2  HIJKXLMN--P-----
        --JK-LMN-OP-----
        HIJK-LMN-OP-----

If you choose to 'trim alignments to region of overlap', the resulting alignment will be:

ref1    JK-LMN-OP
align1  JK-LMNYOP
        JK-LMNYOP
ref2    JK-LMN-OP
align2  JKXLMN--P
        JK-LMN-OP
        JK-LMN-OP

In this example, ref1 and ref2 are fragments of the same sequence that differ in length. Each is aligned to its respective alignment. The locations of gaps in the two alignments differ because of a "Y" insertion in align1 and an "X" insertion in align2. The gap character is shown as "-". If your alignment uses a different gap character, enter it in the space provided on the input form. Output consists of the two alignments synchronized and joined into one alignment that you can download. The format of this alignment is the same as the format of your first submitted alignment. You can also download the joined alignment in "pretty printed" or "pretty printed and output aligned" formats.
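The 'trim alignments to region of overlap' option can be pictured with the short Python sketch below. It assumes that the region of overlap runs from the first column in which both reference sequences have started to the last column before either one ends; the function name trim_to_overlap is hypothetical and this is not the program's own code.

```python
def trim_to_overlap(rows, ref_a, ref_b, gap="-"):
    """Keep only the columns spanned by both reference rows."""
    def span(seq):
        # First and last column that hold a residue rather than a gap.
        filled = [i for i, ch in enumerate(seq) if ch != gap]
        if not filled:
            raise ValueError("reference row contains only gaps")
        return filled[0], filled[-1]

    a_first, a_last = span(rows[ref_a])
    b_first, b_last = span(rows[ref_b])
    start, end = max(a_first, b_first), min(a_last, b_last)
    if start > end:
        raise ValueError("the reference sequences do not overlap")
    return [row[start:end + 1] for row in rows]

# Three rows of the synchronized example: ref1, one align1 sequence, ref2.
rows = ["--JK-LMN-OPQR-ST",
        "--JK-LMNYOPQRYST",
        "HIJK-LMN-OP-----"]
for row in trim_to_overlap(rows, 0, 2):
    print(row)
```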

last modified: Thu Nov 8 11:42 2007


Questions or comments? Contact us at seq-info@lanl.gov.