Scientific Supercomputing at the NIH

Blat on Helix
BLAT is a DNA/protein sequence analysis program written by Jim Kent at UCSC. On DNA, it is designed to quickly find sequences of 95% or greater similarity that are 40 bases or longer. It may miss more divergent or shorter alignments. It will find perfect matches of 33 bases, and sometimes perfect matches as short as 22 bases. On proteins, BLAT finds sequences of 80% or greater similarity that are 20 amino acids or longer. In practice, DNA BLAT works well on primates, and protein BLAT on land vertebrates.

Typing 'blat' with no parameters on the command line prints a brief help text.

For large numbers of sequences, BLAT is best run on Biowulf.
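One common way to prepare a large query set for a cluster is to split it into smaller FASTA files so that each file can be searched as a separate job. The sketch below is only an illustration of that idea: the function name split_fasta, the chunk naming scheme, and the file names in the usage comment are assumptions, and how jobs are actually submitted on Biowulf is not covered here.

```shell
# Split a multi-sequence FASTA file into chunks of at most N sequences each.
# Usage (hypothetical file names): split_fasta queries.fa 500 chunk
# Produces chunk.001.fa, chunk.002.fa, ...
split_fasta() {
    # $1 = input FASTA, $2 = sequences per chunk, $3 = output prefix
    awk -v n="$2" -v prefix="$3" '
        /^>/ {
            # Start a new output file every n header lines.
            if (count % n == 0) {
                if (out) close(out)
                out = sprintf("%s.%03d.fa", prefix, ++chunk)
            }
            count++
        }
        # Write header and sequence lines to the current chunk.
        out { print > out }
    ' "$1"
}
```

Each resulting chunk can then be used as the query file in a separate blat run against the same database.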

Version

Type 'blat' with no parameters to print the currently installed version of BLAT along with a brief help page.

Sample session (user input follows the helix% prompt):

BLAT with a single input sequence against mouse chromosome X.
helix% blat /fdb/genome/mouse-mar2006/chrX.fa ./nuc.fasta output.psl
Loaded 165556469 letters in 1 sequences
Searched 1350 bases in 1 sequences

helix%
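The session above writes its alignments to output.psl. As a rough sketch of how that file might be inspected, the function below prints the query name, target name, target start and end, and a simplified percent identity for each alignment. It assumes the standard tab-separated PSL layout with 21 columns (matches, misMatches, and repMatches in columns 1 to 3, qName in column 10, tName in column 14, tStart and tEnd in columns 16 and 17). The function name psl_summary is hypothetical, and the identity figure ignores gaps, so it is not the score reported by the UCSC web interface.

```shell
# Summarize alignments in a PSL file as:
#   qName  tName  tStart  tEnd  percent-identity
# Header lines are skipped by requiring a numeric first column,
# so this works with or without the PSL header.
psl_summary() {
    awk -F'\t' '$1 ~ /^[0-9]+$/ && NF >= 21 {
        aligned = $1 + $2 + $3      # matches + misMatches + repMatches
        if (aligned > 0)
            printf "%s\t%s\t%d\t%d\t%.1f\n", $10, $14, $16, $17, 100 * ($1 + $3) / aligned
    }' "$1"
}
```

For the session above, this would be run as psl_summary output.psl.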

BLAT databases

Any file containing Fasta sequences can be used as a BLAT database. A large selection of commonly used databases, including the human and mouse genome data, is maintained on the Helix Systems. [List and update status of Fasta-format databases]
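Because any Fasta file can serve as the database, a personal sequence file can be passed as the first argument to blat in place of a system database. The sketch below adds a quick sanity check first; it only confirms that the first non-blank line starts with '>' and does not validate the sequences themselves. The function name looks_like_fasta and the file names in the usage comment are hypothetical.

```shell
# Succeed if the first non-blank line of the file starts with '>'.
# This is a minimal sanity check, not a full Fasta validator.
looks_like_fasta() {
    awk 'NF { ok = ($0 ~ /^>/); exit } END { exit !ok }' "$1"
}

# Usage (hypothetical file names):
#   looks_like_fasta my_sequences.fa && blat my_sequences.fa query.fa output.psl
```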

Documentation