Scientific Supercomputing at the NIH

WU-BLAST on Helix
WU-BLAST performs sensitive, selective and rapid similarity searches of protein and nucleotide sequence databases. WU-BLAST 2.0 builds on WU-BLAST 1.4, which in turn was based on the public-domain NCBI-BLAST version 1.4. It was developed by Warren Gish at Washington University. See the WU-BLAST website (http://blast.wustl.edu).

WU-BLAST runs with large numbers of query sequences (>100) are likely better suited to the Biowulf cluster. Contact the Helix Systems staff (staff@helix.nih.gov) if you have questions about running WU-BLAST.
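
To see which side of that guideline a query file falls on, it is enough to count its FASTA header lines. The following is a minimal sketch, assuming all queries sit in one FASTA file in which every record starts with `>`. The function names and the constant are illustrative; the value 100 is taken from the guideline above and is not an enforced limit.

```python
import io

BIOWULF_THRESHOLD = 100  # from the guideline above; an assumption, not a hard limit


def count_fasta_records(handle):
    """Count records in an open FASTA file by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


def suggest_system(n_records, threshold=BIOWULF_THRESHOLD):
    """Return 'biowulf' for more than `threshold` sequences, else 'helix'."""
    return "biowulf" if n_records > threshold else "helix"


# Small in-memory example; a real file would be opened with open(path).
example = io.StringIO(">seq1\nTCCPSIVARS\n>seq2\nSCCPSTAARN\n")
n = count_fasta_records(example)
print(n, suggest_system(n))
```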

The WU-BLAST family of programs includes blastp, blastn, blastx, tblastn and tblastx.

Update status of WU-BLAST databases on Helix

Version

The WU-BLAST version is printed at the top of every WU-BLAST output file.
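
If the version has to be recorded automatically, the banner line can be picked apart with a regular expression. This is a rough sketch, assuming the banner has the same shape as the one in the sample output on this page (program name, version string, then a bracketed release date). The pattern and the function name `parse_banner` are illustrative, not part of WU-BLAST.

```python
import re

# Matches a banner such as:
#   TBLASTN 2.0MP-WashU [04-May-2006] [linux26-x64-I32LPF64 2006-05-10T17:22:28]
# Only the first bracketed field (the release date) is captured.
BANNER_RE = re.compile(
    r"^(?P<program>\S+)\s+(?P<version>\S+)\s+\[(?P<release>[^\]]+)\]"
)


def parse_banner(first_line):
    """Return program, version and release date, or None if the line differs."""
    match = BANNER_RE.match(first_line.strip())
    return match.groupdict() if match else None


banner = "TBLASTN 2.0MP-WashU [04-May-2006] [linux26-x64-I32LPF64 2006-05-10T17:22:28]"
info = parse_banner(banner)
print(info["program"], info["version"], info["release"])
```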

Sample session (user input follows each prompt):

helix% wublast

 WU-BLAST searches for sequences similar to a query sequence. The query and the
 database searched can be either peptide or nucleic acid in any combination.

Search with what query sequence? cram_craab.fas

Your query sequence is a protein sequence. Available programs are:

    blastp - protein query sequence against protein database
    tblastn - protein query sequence against a nucleotide database
          translated in all 6 reading frames

Which program do you want to run (blastp)? tblastn

The following nucleotide databases are available:
(or enter your own database with full pathname)
    1) nt - all nonredundant Genbank+EMBL+DDBJ+PDB (no EST, STS, GSS or HTG)
    2) est_human - nonredundant Genbank+EMBL+DDBJ EST human sequences
    3) est_mouse - nonredundant Genbank+EMBL+DDBJ EST mouse sequences
    4) pdb.nt - from the 3-dimensional structures 
    5) ecoli.nt - ecoli genomic sequences
    6) mito.nt - mitochondrial sequences
    7) yeast.nt - yeast (Saccharomyces cerevisiae) genomic sequences
    8) drosoph.nt - drosophila sequences
    9) hs_genome - human genome assembly (Build 35, May 2004)
   10) hs_genome.rna - human genome RNA (Build 35, May 2004)
   11) mouse_genome - mouse genome assembly (Build 33, June 2004)
   12) mouse_genome.rna - mouse genome RNA (Build 33, June 2004)
   13) ref.human.rna - Refseq Human RNA
   14) ref.mouse.rna - RefSeq Mouse RNA

which database (1)? 1

Use NCBI-Blast parameters? [n]: 

Any additional WUBlast parameters (e.g. -E=1.0 -V=10 -B=10): 

What should I call the output file (cram_craab.tblastn) ? 
-----------------------------------------------------------------------------
Running command as follows:
/usr/local/wublast/x86_64/tblastn /fdb/wublastdb/nt cram_craab.fas  -o cram_craab.tblastn

WARNING:  Use of the hspsepSmax parameter should be considered with long
          database sequences, to improve the biological relevance of the HSP
          groups that are assembled and to improve the statistical
          discrimination of these groups from random background.

WARNING:  hspmax=1000 was exceeded by 407 of the database sequences, causing
          the associated cutoff score, S2, to be transiently set as high as
          37.
helix% 
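
The command echoed in the session suggests that the interactive wrapper simply assembles a program path, a database path, the query file and an output option. The sketch below rebuilds that command line in the same order, which may help when scripting many searches. It is inferred from the single command shown above, not from the wrapper's source: the two directory constants are copied from that command, and the function name `build_command` and its parameters are hypothetical.

```python
import os
import shlex

# Locations copied from the command echoed in the session above; they are
# specific to Helix and are assumptions on any other system.
WUBLAST_BIN = "/usr/local/wublast/x86_64"
WUBLAST_DB = "/fdb/wublastdb"


def build_command(program, database, query, extra_params=(), output=None):
    """Assemble argv in the order: program, database, query, options, -o output."""
    if output is None:
        # Default seen in the session: query name with its extension
        # replaced by the program name (cram_craab.fas -> cram_craab.tblastn).
        output = f"{os.path.splitext(query)[0]}.{program}"
    argv = [f"{WUBLAST_BIN}/{program}", f"{WUBLAST_DB}/{database}", query]
    argv.extend(extra_params)  # e.g. ["-E=1.0", "-V=10", "-B=10"]
    argv.extend(["-o", output])
    return argv


argv = build_command("tblastn", "nt", "cram_craab.fas")
print(shlex.join(argv))
```

The list form can be passed directly to `subprocess.run(argv)`, which avoids shell quoting problems with file names; `shlex.join` is used here only to print the command for inspection.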

Sample output

TBLASTN 2.0MP-WashU [04-May-2006] [linux26-x64-I32LPF64 2006-05-10T17:22:28]

Copyright (C) 1996-2006 Washington University, Saint Louis, Missouri USA.
All Rights Reserved.

Reference:  Gish, W. (1996-2006) http://blast.wustl.edu

Query=  CRAM_CRAAB, 46 aa.
        (46 letters)

Database:  All Non-redundant GenBank+EMBL+DDBJ+PDB (but no EST, STS, GSS, HTG)
           built on Fri Mar 14 20:58:19 2008 
           6,546,745 sequences; 23,125,808,238 total letters.

Searching....10....20....30....40....50....60....70....80....90....100% done

                                                                     Smallest
                                                                       Sum
                                                     Reading  High  Probability
Sequences producing High-scoring Segment Pairs:        Frame Score  P(N)      N

emb|X81709.1|TGTHI14 T.gesneriana Thi1-4 mRNA for thionin... +2   149  8.4e-08   1
dbj|AB072338.1| Avena sativa mRNA for leaf thionin Asthi1... +3   140  6.3e-07   1
dbj|AB072339.1| Avena sativa mRNA for leaf thionin Asthi2... +2   134  2.8e-06   1
dbj|AB072340.1| Avena sativa mRNA for leaf thionin Asthi3... +3   129  1.1e-05   1
[...]
>emb|X81709.1|TGTHI14 T.gesneriana Thi1-4 mRNA for thionin class 1
        Length = 535

 Score = 149 (57.5 bits), Expect = 8.4e-08, P = 8.4e-08
 Identities = 25/43 (58%), Positives = 29/43 (67%), Frame = +2

Query:     2 TCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDY 44
             +CCPS  AR+ +NVCR PGTP  +CA   GC II G  CP DY
Sbjct:    41 SCCPSTAARNCYNVCRFPGTPRPVCAATCGCKIITGTKCPPDY 169


>dbj|AB072338.1| Avena sativa mRNA for leaf thionin Asthi1, complete cds
        Length = 677

 Score = 140 (54.3 bits), Expect = 6.3e-07, P = 6.3e-07
 Identities = 24/43 (55%), Positives = 30/43 (69%), Frame = +3

Query:     2 TCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDY 44
             +CC  I+AR+ +NVCR+PGTP  +CAT   C II G  CP DY
Sbjct:   111 SCCKDIMARNCYNVCRIPGTPRPVCATTCRCKIISGNKCPKDY 239
[...]
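
The one-line hit summaries near the top of the report are convenient to pull into a table. The following sketch parses lines shaped like the ones in the sample output above, assuming a translated search where a reading-frame column is present (reports from blastp or blastn have no such column and would need a different pattern). The regular expression and the name `parse_hit_line` are illustrative only.

```python
import re

# One summary line: identifier, description, frame, score, P(N), N.
# The description is matched lazily so the trailing numeric columns win.
HIT_RE = re.compile(
    r"^(?P<id>\S+)\s+(?P<desc>.*?)\s+"
    r"(?P<frame>[+-][1-3])\s+(?P<score>\d+)\s+(?P<p>\S+)\s+(?P<n>\d+)\s*$"
)


def parse_hit_line(line):
    """Parse one summary line into a dict, or return None if it does not fit."""
    match = HIT_RE.match(line)
    if match is None:
        return None
    try:
        p_value = float(match["p"])
    except ValueError:
        return None
    return {
        "id": match["id"],
        "description": match["desc"],
        "frame": int(match["frame"]),
        "score": int(match["score"]),
        "p": p_value,
        "n": int(match["n"]),
    }


sample = [
    "emb|X81709.1|TGTHI14 T.gesneriana Thi1-4 mRNA for thionin... +2   149  8.4e-08   1",
    "dbj|AB072338.1| Avena sativa mRNA for leaf thionin Asthi1... +3   140  6.3e-07   1",
    "[...]",
]
hits = [h for h in map(parse_hit_line, sample) if h is not None]
for hit in hits:
    print(hit["id"], hit["frame"], hit["score"], hit["p"])
```

Lines that do not fit the pattern, such as the `[...]` elision marker, are skipped instead of raising an error.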

Documentation