Scientific Supercomputing at the NIH

ClustalW on Helix
ClustalW is a general-purpose multiple sequence alignment program for DNA or protein sequences. It greatly improves the sensitivity of the commonly used progressive multiple sequence alignment method, particularly for divergent protein sequences. It can be run interactively through menus, or non-interactively with options given on the command line.

ClustalW can be run in two ways on the Helix Systems:

  1. Multiple Alignment Workshop. A web interface to ClustalW and other multiple sequence alignment programs.

  2. Command-line use on Helix
    For many sequences or large sequences, the command line is usually a better choice than the web interface. At the helix prompt, type 'clustalw'.
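
Because ClustalW also accepts its options on the command line, an alignment can be run as a single non-interactive call, which is convenient for batch jobs. The Python sketch below builds such a call. It assumes the `-INFILE`, `-ALIGN`, and `-OUTFILE` options of ClustalW 1.8x (run `clustalw -options` to confirm what the installed version accepts). The helper names are hypothetical, and `seqs.inp` and `myseqs.aln` are placeholder file names.

```python
import shutil
import subprocess


def build_clustalw_command(infile, outfile, executable="clustalw"):
    """Return the argument list for a non-interactive ClustalW alignment.

    Assumption: the installed ClustalW accepts -INFILE, -ALIGN and -OUTFILE.
    """
    return [
        executable,
        f"-INFILE={infile}",    # sequences to align, all in one file
        "-ALIGN",               # do a complete multiple alignment
        f"-OUTFILE={outfile}",  # name of the alignment file to write
    ]


def run_clustalw(cmd):
    """Run the command if the executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


cmd = build_clustalw_command("seqs.inp", "myseqs.aln")
print(" ".join(cmd))
```

Calling `run_clustalw(cmd)` on a machine where ClustalW is installed would then start the alignment without any menu prompts.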

    Version

    Type 'clustalw' on Helix to see the currently installed version, which is shown in the start-up banner.

    Sample session (user input follows each prompt):

    helix% clustalw
    
     **************************************************************
     ******** CLUSTAL W (1.83) Multiple Sequence Alignments  ********
     **************************************************************
    
         1. Sequence Input From Disc
         2. Multiple Alignments
         3. Profile / Structure Alignments
         4. Phylogenetic trees
    
         S. Execute a system command
         H. HELP
         X. EXIT (leave program)
    
    Your choice: 1
    
    Sequences should all be in 1 file.
    
    7 formats accepted: 
    NBRF/PIR, EMBL/SwissProt, Pearson (Fasta), GDE, Clustal, GCG/MSF, RSF.
    
    Enter the name of the sequence file: seqs.inp
    
    Sequence format is Pearson
    Sequences assumed to be PROTEIN
    
    Sequence 1: chiins          110 aa
    Sequence 2: xenins          110 aa
    Sequence 3: humins          110 aa
    Sequence 4: monins          110 aa
    Sequence 5: dogins          110 aa
    Sequence 6: hamins          110 aa
    Sequence 7: bovins          110 aa
    Sequence 8: guiins          110 aa
    
     **************************************************************
     ******** CLUSTAL W (1.83) Multiple Sequence Alignments  ********
     **************************************************************
    
         1. Sequence Input From Disc
         2. Multiple Alignments
         3. Profile / Structure Alignments
         4. Phylogenetic trees
    
         S. Execute a system command
         H. HELP
         X. EXIT (leave program)
    
    Your choice: 2
    ****** MULTIPLE ALIGNMENT MENU ******
        1.  Do complete multiple alignment now (Slow/Accurate)
        2.  Produce guide tree file only
        3.  Do alignment using old guide tree file
    
        4.  Toggle Slow/Fast pairwise alignments = SLOW
    
        5.  Pairwise alignment parameters
        6.  Multiple alignment parameters
    
        7.  Reset gaps before alignment? = OFF
        8.  Toggle screen display          = ON
        9.  Output format options
    
        S.  Execute a system command
        H.  HELP
        or press [RETURN] to go back to main menu
    
    Your choice: 1
    
    Enter a name for the CLUSTAL output file  [seqs.aln]: myseqs.aln
    
    Enter name for new GUIDE TREE           file   [seqs.dnd]: 
    
    Start of Pairwise alignments
    Aligning...
    Sequences (1:2) Aligned. Score:  66
    Sequences (1:3) Aligned. Score:  63
    [...]
    Sequences (7:8) Aligned. Score:  61
    Guide tree        file created:   [seqs.dnd]
    Start of Multiple Alignment
    There are 7 groups
    Aligning...
    Group 1: Sequences:   2      Score:2138
    Group 2: Sequences:   2      Score:2373
    Group 3: Sequences:   4      Score:2157
    Group 4: Sequences:   5      Score:2146
    Group 5: Sequences:   6      Score:1971
    Group 6: Sequences:   2      Score:1972
    Group 7: Sequences:   8      Score:1739
    Alignment Score 13222
    
    Consensus length = 110
    CLUSTAL-Alignment file created  [seqs.aln]
    
    CLUSTAL W (1.83) multiple sequence alignment
    
    dogins          MALWMRLLPLLALLALWAPAPTRAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVED
    bovins          MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEG
    humins          BALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
    monins          BALWMRLLPLLALLALWGPDPVPAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
    hamins          MTLWMRLLPLLTLLVLWEPNPAQAFVNQHLCGSHLVEALYLVCGERGFFYTPKSRRGVED
    guiins          MALWMHLLTVLALLALWGPNTGQAFVSRHLCGSNLVETLYSVCQDDGFFYIPKDRRELED
    chiins          BALWIRSLPLLALLVFSGPGTSYAAANQHLCGSHLVEALYLVCGERGFFYSPKARRDVEQ
    xenins          BALWMQCLPLVLVLFFSTPNTE-ALVNQHLCGSHLVEALYLVCGDRGFFYYPKVKRDMEQ
                     :** :  .:: :* :  * .  * ..:*****:***:** ** : **** ** :*  * 
    
    dogins          LQVRDVELAGAPGEGGLQPLALEGALQKRGIVEQCCTSICSLYQLENYCN
    bovins          PQVGALELAGGPGAGG-----LEGPPQKRGIVEQCCASVCSLYQLENYCN
    humins          LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
    monins          PQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
    hamins          PQVAQLELGGGPGADDLQTLALEVAQQKRGIVDQCCTSICSLYQLENYCN
    guiins          PQVEQTELGMGLGAGGLQPLALEMALQKRGIVDQCCTGTCTRHQLQSYCN
    chiins          PLVSS---PLRGEAGVLPFQQEEYEKVKRGIVEQCCHNTCSLYQLENYCN
    xenins          ALVSG---PQDNELDGMQLQPQEYQKMKRGIVEQCCHSTCSLFQLESYCN
                      *           .       *    *****:*** . *: .**:.***
    
    Press [RETURN] to continue or  X  to stop: X
    
     **************************************************************
     ******** CLUSTAL W (1.83) Multiple Sequence Alignments  ********
     **************************************************************
    
         1. Sequence Input From Disc
         2. Multiple Alignments
         3. Profile / Structure Alignments
         4. Phylogenetic trees
    
         S. Execute a system command
         H. HELP
         X. EXIT (leave program)
    
    Your choice: x
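
The alignment file written during the session is plain text in CLUSTAL format: a header line, then blocks in which each line holds a sequence name followed by a segment of the aligned sequence, with a conservation line under each block. A minimal Python sketch for reading such a file back into full-length aligned sequences might look like this. The function name `parse_clustal` is hypothetical, and the excerpt reuses two of the sequences from the session above, with the conservation lines left out.

```python
def parse_clustal(text):
    """Collect aligned sequences from CLUSTAL-format text, keyed by name."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and conservation lines
        # (conservation lines begin with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, segment = fields[0], fields[1]
        # Each block contributes one more segment of every sequence.
        sequences[name] = sequences.get(name, "") + segment
    return sequences


excerpt = """CLUSTAL W (1.83) multiple sequence alignment

dogins          MALWMRLLPLLALLALWAPAPTRAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVED
bovins          MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEG

dogins          LQVRDVELAGAPGEGGLQPLALEGALQKRGIVEQCCTSICSLYQLENYCN
bovins          PQVGALELAGGPGAGG-----LEGPPQKRGIVEQCCASVCSLYQLENYCN
"""

aligned = parse_clustal(excerpt)
for name, seq in aligned.items():
    # Print the name, the aligned length, and the number of gap characters.
    print(name, len(seq), seq.count("-"))
```

For a real run, the text would come from the alignment file itself, for example `parse_clustal(open("myseqs.aln").read())`.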
    

Documentation