


Common Sequence Formats

Below are examples of some of the alignment formats our web interfaces recognize. Each example shows two or more sequences in the given format. To convert your sequences from one format to another, use our Format Converter tool. More format descriptions are available on the Pasteur Institute website: File Formats.

IMPORTANT! Do not save your sequence files as ".doc" files or open them in any word-processing program. Word-processing programs introduce changes that corrupt the sequence format, and this corruption is a common reason for our web-based tools to fail.
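
As a rough illustration of the kind of damage meant here, the following Python sketch looks for signs that a file has passed through a word processor. The function name find_format_problems and the specific checks (document signatures and non-ASCII bytes) are assumptions chosen for illustration; they are not the checks our tools perform.

```python
# Hypothetical helper: flags common signs that a sequence file is no longer plain text.
# The list of checks is an assumption for illustration, not an exhaustive test.
SIGNATURES = {
    b"{\\rtf": "RTF document",
    b"PK\x03\x04": "zip container (possibly .docx)",
    b"\xd0\xcf\x11\xe0": "OLE container (possibly .doc)",
}


def find_format_problems(data: bytes) -> list[str]:
    """Return a list of warnings; an empty list means nothing suspicious was found."""
    problems = []
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            problems.append(label)
            return problems  # binary container: further checks add nothing
    # Curly quotes and non-breaking spaces show up as bytes above 127.
    if any(byte > 127 for byte in data):
        problems.append("non-ASCII bytes")
    return problems
```

Reading the file in binary mode, for example open(path, "rb").read(), and passing the result to the function keeps the check independent of any text encoding.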

 

FastA:

>CPZANT
ATGGGAGCGGGGGCGTCTGTTTTGAGGGGAGAGAAGCTAGATACATGGGA
AAGTATCAGGCTTCGGCCCGGTGGCAAGAAAAAGTACATGATAAAACATC
TGGTTTGGGCAAGATCGGAGCTGCAGCGTTTTGCGCTCAGCTCCTCCCTT
CTAGAAACATCAGAAGGTTGTGAAAAGGCTATCCATCAATTGAGCCCTTC
CATAGAAATAAGATCCCCTGAAATAATATCTTTGTTTAACACCATTTGTG
>U455
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGAAAAAATTAGATTCATGGGA
GAAAATTCGGTTAAGGCCAGGGGGAAACAAAAAATATAGACTGAAACATT
TAGTATGGGCAAGCAGGGAGCTGGAAAAATTCACACTTAACCCTGGCCTT
TTAGAAACAGCAGAAGGATGTCAGCAAATACTGGGACAATTACAACCAGC
TCTCCAGACAGGAACAGAAGAACTTAGATCATTATATAATACAGTAGCAG
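
The FastA layout above (a ">" line carrying the name, followed by one or more sequence lines) can be read with a few lines of Python. This is a minimal sketch; the function name parse_fasta is hypothetical, and the whole header line after ">" is treated as the name.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each sequence name to its sequence, joining wrapped lines."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records
```

For example, parse_fasta(">CPZANT\nATGG\nGAGC\n") returns {"CPZANT": "ATGGGAGC"}.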

 

Table:

CPZANT	ATGGGAGCGGGGGCGTCTGTTTTGAGGGGAGAGAAGCTAGATACATGGGAAAGTATCAGGCTTCGGCCCGGTGGCAAGAAAAAGTACATGAT
U455	ATGGGTGCGAGAGCGTCAGTATTAAGCGGGAAAAAATTAGATTCATGGGAGAAAATTCGGTTAAGGCCAGGGGGAAACAAAAAATATAGACT
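
Because each Table line holds a name, a tab, and the complete sequence, converting it to FastA is mostly a matter of re-wrapping. The sketch below assumes a tab separator and a default line width of 50; table_to_fasta is a hypothetical name and is not the Format Converter tool.

```python
def table_to_fasta(text: str, width: int = 50) -> str:
    """Convert 'name<TAB>sequence' lines into FastA text wrapped at `width`."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, seq = line.split("\t", 1)
        seq = seq.strip()
        out.append(">" + name.strip())
        # Slice the sequence into fixed-width pieces.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(piece + "\n" for piece in out)
```

A line without a tab raises ValueError, which is usually the desired signal that the input is not in Table format.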

 

IG (IntelliGenetics):

;
CPZANT
ATGGGAGCGGGGGCGTCTGTTTTGAGGGGAGAGAAGCTAGATACATGGGA
AAGTATCAGGCTTCGGCCCGGTGGCAAGAAAAAGTACATGATAAAACATC
TGGTTTGGGCAAGATCGGAGCTGCAGCGTTTTGCGCTCAGCTCCTCCCTT
CTAGAAACATCAGAAGGTTGTGAAAAGGCTATCCATCAATTGAGCCCTTC
CATAGAAATAAGATCCCCTGAAATAATATCTTTGTTTAACACCATTTGTG
;
U455
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGAAAAAATTAGATTCATGGGA
GAAAATTCGGTTAAGGCCAGGGGGAAACAAAAAATATAGACTGAAACATT
TAGTATGGGCAAGCAGGGAGCTGGAAAAATTCACACTTAACCCTGGCCTT
TTAGAAACAGCAGAAGGATGTCAGCAAATACTGGGACAATTACAACCAGC
TCTCCAGACAGGAACAGAAGAACTTAGATCATTATATAATACAGTAGCAG
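
In the IG example above, each record starts with a ";" line, followed by the name on its own line and then the sequence. A minimal Python sketch of a reader follows; parse_ig is a hypothetical name. Stripping a trailing "1" or "2" is an assumption based on the general IntelliGenetics convention of ending a sequence with such a digit, which the example above does not show.

```python
def parse_ig(text: str) -> dict[str, str]:
    """Read IG-style records: ';' comment line(s), a name line, then sequence lines."""
    records: dict[str, str] = {}
    name = None
    expect_name = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(";"):
            expect_name = True  # the next non-comment line is a name
            continue
        if expect_name:
            name = line
            records[name] = ""
            expect_name = False
        elif name is not None:
            # Assumption: a trailing 1 or 2 is a terminator, not sequence data.
            records[name] += line.rstrip("12")
    return records
```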

 

MSF:

            1                                                   50
CPZANT      ATGGGAGCGG GGGCGTCTGT TTTGAGGGGA GAGAAGCTAG ATACATGGGA
U455        ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG AAAAAATTAG ATTCATGGGA
BZ126B      ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAAAATTAG ATGCTTGGGA
IBNG        ATGGGTGCGA GAGCGTCAGT ATTAAGTGGG GGAAAATTAG ATGCATGGGA
VI59        ATAGGTGCGA GAGCGTCAGT ATTAAGCGAG GGAAAATTAG ATGCATAGGA
VI310       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAAAATTAG ATAAGTGGGA
SF2         ATGGGTGCGA GAGCGTCGGT ATTAAGCGGG GGAGAATTAG ATAAATGGGA
BZ167       ATGGGTGCGA GAGCGTCGGT ATTAAGCGGG GGAGAATTAG ATAGGTGGGA
PH153       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGGAAATTAG ATAGATGGGA
PH136       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ACAGATGGGA
TB132       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGACAATTAG ATAGATGGAA
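
The MSF excerpt above shows one alignment block: a position ruler, then one line per sequence with the residues in groups of ten. The Python sketch below collects such blocks into full sequences. The name parse_msf_blocks is hypothetical. A complete GCG MSF file also carries a header that ends with a "//" line, which the excerpt omits; the sketch skips that header when it is present. It also assumes no sequence name consists only of digits, since such lines are treated as rulers.

```python
def parse_msf_blocks(text: str) -> dict[str, str]:
    """Concatenate the aligned pieces of each sequence across MSF-style blocks."""
    lines = text.splitlines()
    # Skip an optional header terminated by a line containing only "//".
    for i, line in enumerate(lines):
        if line.strip() == "//":
            lines = lines[i + 1:]
            break
    records: dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if not parts or parts[0].isdigit():
            continue  # blank line or position ruler such as "1   50"
        name, chunks = parts[0], parts[1:]
        records[name] = records.get(name, "") + "".join(chunks)
    return records
```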

 

GDE:

#CPZANT
ATGGGAGCGGGGGCGTCTGTTTTGAGGGGAGAGAAGCTAGATACATGGGA
AAGTATCAGGCTTCGGCCCGGTGGCAAGAAAAAGTACATGATAAAACATC
TGGTTTGGGCAAGATCGGAGCTGCAGCGTTTTGCGCTCAGCTCCTCCCTT
CTAGAAACATCAGAAGGTTGTGAAAAGGCTATCCATCAATTGAGCCCTTC
CATAGAAATAAGATCCCCTGAAATAATATCTTTGTTTAACACCATTTGTG
#U455
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGAAAAAATTAGATTCATGGGA
GAAAATTCGGTTAAGGCCAGGGGGAAACAAAAAATATAGACTGAAACATT
TAGTATGGGCAAGCAGGGAGCTGGAAAAATTCACACTTAACCCTGGCCTT
TTAGAAACAGCAGAAGGATGTCAGCAAATACTGGGACAATTACAACCAGC
TCTCCAGACAGGAACAGAAGAACTTAGATCATTATATAATACAGTAGCAG
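
As shown, this GDE layout differs from FastA only in the character that introduces the name line. A small Python sketch of the conversion follows; gde_to_fasta is a hypothetical name. Treating "%" as a second name marker (used for protein sequences in GDE flat files) is an assumption that goes beyond the example above.

```python
def gde_to_fasta(text: str) -> str:
    """Rewrite GDE name lines ('#name', or '%name' by assumption) as FastA '>name' lines."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line[0] in "#%":
            out.append(">" + line[1:].strip())
        else:
            out.append(line)  # sequence lines pass through unchanged
    return "".join(piece + "\n" for piece in out)
```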

 

PHYLIP (Interleaved):

     11    1683  I
CPZANT     ATGGGAGCGG GGGCGTCTGT TTTGAGGGGA GAGAAGCTAG ATACATGGGA
U455       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG AAAAAATTAG ATTCATGGGA
BZ126B     ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAAAATTAG ATGCTTGGGA
IBNG       ATGGGTGCGA GAGCGTCAGT ATTAAGTGGG GGAAAATTAG ATGCATGGGA
VI59       ATAGGTGCGA GAGCGTCAGT ATTAAGCGAG GGAAAATTAG ATGCATAGGA
VI310      ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAAAATTAG ATAAGTGGGA
SF2        ATGGGTGCGA GAGCGTCGGT ATTAAGCGGG GGAGAATTAG ATAAATGGGA
BZ167      ATGGGTGCGA GAGCGTCGGT ATTAAGCGGG GGAGAATTAG ATAGGTGGGA
PH153      ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGGAAATTAG ATAGATGGGA
PH136      ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ACAGATGGGA
TB132      ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGACAATTAG ATAGATGGAA
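
The first line of a PHYLIP file gives the number of sequences and the alignment length (11 and 1683 above), here followed by "I" for interleaved. The example shows only the first block; in a full interleaved file, later blocks repeat the sequences in the same order without names. The Python sketch below reads such a file under the assumption that names occupy the first 10 characters of each line in the first block. The function name parse_phylip_interleaved is hypothetical.

```python
def parse_phylip_interleaved(text: str) -> dict[str, str]:
    """Read an interleaved PHYLIP alignment into a name -> sequence mapping."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    ntax, nchar = int(header[0]), int(header[1])
    names, seqs = [], []
    # First block: the name is in columns 1-10, sequence data follows.
    for ln in lines[1:1 + ntax]:
        names.append(ln[:10].strip())
        seqs.append("".join(ln[10:].split()))
    # Later blocks: sequence data only, in the same order as the first block.
    for i, ln in enumerate(lines[1 + ntax:]):
        seqs[i % ntax] += "".join(ln.split())
    for name, seq in zip(names, seqs):
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} characters, found {len(seq)}")
    return dict(zip(names, seqs))
```

The length check against the header catches truncated files and lines that were re-wrapped by an editor.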

 

PHYLIP (Sequential):

     11    1683
CPZANT     ATGGGAGCGG GGGCGTCTGT TTTGAGGGGA GAGAAGCTAG ATACATGGGA
           AAGTATCAGG CTTCGGCCCG GTGGCAAGAA AAAGTACATG ATAAAACATC
           TGGTTTGGGC AAGATCGGAG CTGCAGCGTT TTGCGCTCAG CTCCTCCCTT
           CTAGAAACAT CAGAAGGTTG TGAAAAGGCT ATCCATCAAT TGAGCCCTTC
           CATAGAAATA AGATCCCCTG AAATAATATC TTTGTTTAAC ACCATTTGTG
           TTCTGTGGTG CGTACATAAA GGGGAAAAGA TAAAAGACAC AGAACAAGCC
           GTTAAAACAG TGAAAATGAA AGTAATGCAG ACACAAGCAG AAACAGGAAG
           TAGCCAAACC GCAAGCAGAG GCATGCTTCT GCGGCTGCTC CTGTTAAACA
           AACAGTGGTG TCAGCGACAT CTTAGTGGCG AAGGGAGAAA TTACCCCATC
           ATAGTGGATG CAGGAGGAAT AGCAAGGCAT CAGCCACTGA CACCAAGAAC
           CTTAAATGCC TGGGTAAAGT GTGTAGAAGA GAAAAATTTC AATCCAGAAG
           TCATCCCTAT GTTTTCTGCT TTATCAGAAG GGGCAACTCC TCATGATTTA
           AACACCATGC TTAATGCAGT TGGGGACCAT CAAGGAGCCA TGCAGGTGCT
           AAAAGAAGTA ATCAATGAGG AAGCAGCTGA GTGGGATAGG TTACACCCCA
           CTCATGCAGG ACCAGTACAG GCAGGACAAT TAAGGGAACC AACAGGAAGT
           GATATAGCAG GGACAACAAG CACAGTGCAG GAGCAGATGC AATGGATGTC
           AACACCTCAA CAGAATGGAG GAGTCCCAGT AGGGGACATC TATAAGAGAT
           GGATCATCAT GGGATTAAAT AAGGTGGTCA GGANGTATAG TCCAGTCAGC
           ATTCTAGAGA TAAAACAAGG ACCAAAAGAG CCCTTCAGAG ATTATGTGGA
           TAGATTCTAT AAAACAATTA GAGCAGAACA GGCTTCACAG CCTGTGAAAG
           CCTGGATGAC AGAAACCTTG TTAATCCAAA ATGCCAATCC AGATTGCAAA
           CACATCCTGA AGGCTTTGGG AACAGGAGCC TCCTTAGAAG AAATGTTAAC
           AGCTTGTCAA GGAGTAGGAG GCCCAGCCCA TAAGGCAAGA GTGTTGGCAG
           AAGCTATGGC TTCTGCTAAT AAT---GCAC AG------GG AACCGCA---
           GTCTTTCTGC AGAGAGGCAA TGGAAATAGA GGAGGAAAAA GACCTCTCAA
           ATGTTTTAAC TGCGGTAAAG AGGGCCATAC TGCAAGAAAT TGCAAGGCCC
           CAAGAAGGAA AGGCTGCTGG AGATGTGGAC AGGAAGGACA CCAGCTTAAA
           AACTGTCCAG CAACAAATAC AGGAAAAGTA AATTTTTTAG GGAAACCGAC
           CCCCACGTGG TGGGGGTGCA GACCAGGGAA CTTTGTGCAG AAGGAGGAAG
           TAGTG----- ---------- ---------- ---------- -GAGCCAACA
           GCTCCACCCA TAGAG----- ---------- ------ATCT AT--------
           -CAGGAGGAG CACAAG---A GGACT----- -------CAG AAGGGTCTCA
           AGGGGGAG-- ---------- GAGGAACTA- --CCTCCCTC GTATTCCCTG
           AAATCCCTCT TTGGCAAAGA CCAATGA--- ---
U455       ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG AAAAAATTAG ATTCATGGGA
           GAAAATTCGG TTAAGGCCAG GGGGAAACAA AAAATATAGA CTGAAACATT
           TAGTATGGGC AAGCAGGGAG CTGGAAAAAT TCACACTTAA CCCTGGCCTT
           TTAGAAACAG CAGAAGGATG TCAGCAAATA CTGGGACAAT TACAACCAGC
           TCTCCAGACA GGAACAGAAG AACTTAGATC ATTATATAAT ACAGTAGCAG
           TCCTCTATTG TGTACATCAA AGGATAGATG TAAAAGACAC CAAGGAAGCT
           TTAAATAAAA TAGAGGAAAT GCAAAATAAG AACAAGCAAA GG--------
           ---------- ACACAGCAGG CAGCAGCT-- ----AACACA ---GGAAGC-
           ---------- ---------- ---------- --AGTCAAAA TTACCCCATA
           GTGCAAAATG CACAAGGGCA ACCAGTACAC CAGGCCTTAT CACCTAGGAC
           CTTGAATGCA TGGGTGAAAG TAGTAGAAGA CAAGGCTTTC AGCCCAGAAG
           TAATACCCAT GTTTTCAGCA TTATCAGAGG GAGCCACCCC ACAAGATTTA
           AATATGATGC TGAATGTAGT GGGGGGACAC CAGGCAGCTA TGCAAATGTT
           AAAAGATACC ATCAATGAGG AAGCTGCAGA GTGGGACAGG TTACATCCAG
           TGCATGCAGG GCCTATTCCA CCAGGCCAGA TGAGAGAACC AAGGGGAAGT
           GACATAGCAG GAACTACTAG CACCGTTCAA GAACAAATAG GATGGATGAC
           AGGC------ ---AATCCAC CTATCCCAGT GGGAGACATC TATAGAAGAT
           GGATAATCCT GGGATTAAAT AAAATAGTAA GAATGTATAG CCCTGTTAGC
           ATTTTGGACA TAAGACAAGG GCCAAAAGAA CCCTTCAGGG ATTATGTAGA
           TAGATTCTTT AAAACTCTCA GAGCTGAGCA AGCTACACAG GATGTAAAAA
           ACTGGATGAC AGAAACCTTG CTGGTCCAAA ATGCGAATCC AGACTGTAAG
           TCCATTTTAA GAGCATTAGG GCCAGGGGCT ACATTAGAAG AAATGATGAC
           AGCATGCCAG GGAGTGGGAG GACCCGGCCA TAAAGCAAGG GTTTTGGCTG
           AGGCAATGAG TCAAGTA--- ------CAAC AG---ACAAG C---------
           ATAATGATGC AGAGAGGCAA TTTT---AGG GGCCCGAGAA GA---ATTAA
           GTGTTTCAAC TGTGGCAAAG AAGGACACCT AGCCAAAAAT TGTAGGGCCC
           CTAGGAAAAA GGGCTGTTGG AAATGCGGGA AAGAAGGACA CCAAATGAAA
           GACTGCACT- -----GAG-- -AGACAGGCT AATTTTTTAG GGAAAATTTG
           GCCTTCCAAC AAGGGG---A GGCCAGGGAA TTTTCCTCAG AGCAGACCA-
           ---------- ---------- ---------- ---------- -GAGCCAACA
           GCCCCACCAG CAGAA----- ---------- ------ATCT TT---GGGAT
           GGGGGAAAAG ATGACC---T CCCCT----- -------GCG AAACAGGAGC
           TGAAAGAC-- ---------- AGGGAACAGA CT---CCTTT AGTTTCCCTC
           AAATCACTCT TTGGCAACGA CCCCTTGTCA CAG
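
In the sequential layout each sequence is written out in full before the next one begins, so a reader can use the alignment length from the header (1683 above) to decide where one record ends. A minimal Python sketch follows, again assuming 10-character names; parse_phylip_sequential is a hypothetical name.

```python
def parse_phylip_sequential(text: str) -> dict[str, str]:
    """Read a sequential PHYLIP alignment into a name -> sequence mapping."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    ntax, nchar = int(header[0]), int(header[1])
    records: dict[str, str] = {}
    pos = 1
    for _ in range(ntax):
        if pos >= len(lines):
            raise ValueError("file ends before all sequences were read")
        name = lines[pos][:10].strip()
        seq = "".join(lines[pos][10:].split())
        pos += 1
        # Keep reading continuation lines until the declared length is reached.
        while len(seq) < nchar:
            if pos >= len(lines):
                raise ValueError(f"{name}: sequence shorter than {nchar} characters")
            seq += "".join(lines[pos].split())
            pos += 1
        if len(seq) != nchar:
            raise ValueError(f"{name}: sequence longer than {nchar} characters")
        records[name] = seq
    return records
```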

 

SLX:

#=RF        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxx
DH123       .TGGAAGGGCTAATTTACTCCCAGAAAAGACAAGAtATCCTTGACCTGTG
B-MANC      ..................................................
B-HAN       .TGGAAGGGTTAATTTACTCCCCAAAAAGACAAGAgATCCTTGATCTGTG
B-OYI       .TGGAAGGGCTAATTTACTCCCAGAAAAGACAAGAtATTCTTGATCTGTG
B-AUMBCC54  .TGGAAGGGCTAATACGCTCCCAAAGAAGACAAGAtATCCTTGATCTGTG
F-93BR020   .TGGAAGGGTTAATTTACTCCAAGAGAAGACAAGAgATCCTTGATCTGTG
BF-93BR029  .TGGAAGGGTTAATTTATTCCAAGAAAAGACAAGAgATCCTTGATCTGTG
D-94UG114   .TGGAAGGGTTAGTTTGGTCCCCGAAAAGACAAGAgATCCTTGATCTTTG
B-WR27      .TGGAAGGGCTAATTTACTCCCAGAAAAGACAAGAtATCCTTGATCTGTG
D-ELI       .TGGAAGGGCTAATTTGGTCCAAAAAGAGACAAGAgATCCTTGATCTTTG
D-NDK       .TGGAAGGGCTAATTTGGTCCAAGAAAAGACAAGAgATCCTTGATCTTTG
B-LAI       .TGGAAGGGCTAATTCACTCCCAACGAAGACAAGAtATCCTTGATCTGTG
B-BCSG3     .TGGAAGGGCTAATTTTCTCCCAAAGAAGACAAGAtATCCTTGATCTGTG
B-D31       .TGGAAGGGCTAGTTCACTCCCAAAAAAGACAAGAcATCCTTGACCTGTG
B-RF        .TGGATGGGCTAGTGTTCTCCCAGAAAAGACAAGAtATCCTTGATCTGTG
B-JRFL      .TGGAAGGGCTAATTCACTCACAGAAAAGACAAGAtATCCTTGATCTGTG
D-84ZR085   .TGGAAGGGCTAGTTTACTCCCAGAAAAGACAAGAtATCCTTGATCTTTG
B-HIVMN     .TGGATGGGTTAATTTACTCCCAAAAGAGACAAGAcATCCTTGATCTGTG
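
In the SLX example, the "#=RF" line is an annotation rather than a sequence, and its single "." falls on the column where the sequences carry a lowercase letter. The Python sketch below separates annotation lines from sequences and can drop the columns marked "." in the reference line. The names parse_slx and match_columns are hypothetical, and reading "." in "#=RF" as "not a reference column" is an interpretation of the example, not a statement about how our tools treat it.

```python
def parse_slx(text: str) -> tuple[dict[str, str], dict[str, str]]:
    """Return (annotations, sequences); '#=' lines are annotations, other '#' lines are skipped."""
    annotations: dict[str, str] = {}
    records: dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line or a name with no data
        name, seq = parts[0], "".join(parts[1:])
        if name.startswith("#=") :
            target = annotations
        elif name.startswith("#"):
            continue  # assumed to be a free-text comment
        else:
            target = records
        target[name] = target.get(name, "") + seq
    return annotations, records


def match_columns(records: dict[str, str], rf: str) -> dict[str, str]:
    """Keep only the columns whose reference-line character is not '.'."""
    keep = [i for i, ch in enumerate(rf) if ch != "."]
    return {
        name: "".join(seq[i] for i in keep if i < len(seq))
        for name, seq in records.items()
    }
```

A typical call is annotations, records = parse_slx(text) followed by match_columns(records, annotations["#=RF"]).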

 

last modified: Tue Apr 22 12:43 2008


Questions or comments? Contact us at seq-info@lanl.gov.