Description of BLAST Services

  Nucleotide BLAST

Nucleotide BLAST searches allow one to input nucleotide sequences and compare these against other nucleotides.

Standard nucleotide-nucleotide BLAST - Takes nucleotide sequences in FASTA format, GenBank Accession numbers or GI numbers and compares them against the NCBI nucleotide databases.

MEGABLAST - This program uses a "greedy algorithm" (Webb Miller et al.) for nucleotide sequence alignment searches and concatenates many queries to save time spent scanning the database. It is optimized for aligning sequences that differ slightly and is up to 10 times faster than more common sequence similarity programs. It can be used to swiftly compare two large sets of sequences against each other.

Search for short, nearly exact sequences - This search is similar to the standard nucleotide-nucleotide BLAST, with the parameters set automatically to optimize for short query sequences. A short query is more likely to occur by chance in the database, so it is often necessary to raise the Expect value threshold and lower the word size before any results are returned. Low Complexity filtering is also turned off, since it would filter out a large percentage of a short sequence and leave little or no query sequence to search with.
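
To see why a short query is more likely to occur by chance, a rough estimate can be sketched as follows. This is not how BLAST computes its Expect value; it assumes a random database in which each of the four bases is equally likely and independent, and it counts exact matches on one strand only. The function name and the database size are hypothetical.

```python
def expected_exact_hits(query_len: int, db_len: int, alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of a query in a random database.

    Assumption: every residue is equally likely and independent of its
    neighbours, which real sequence databases only approximate.
    """
    if query_len <= 0 or db_len < query_len:
        return 0.0
    positions = db_len - query_len + 1  # possible start positions in the database
    return positions * (1.0 / alphabet_size) ** query_len


if __name__ == "__main__":
    db_len = 10**9  # hypothetical database of one billion bases
    for n in (10, 15, 20, 30):
        print(n, expected_exact_hits(n, db_len))
```

Under these assumptions a 10-base query is expected to match by chance hundreds of times, while a 30-base query essentially never does, which is why the thresholds for short queries have to be relaxed.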

  Protein BLAST

Protein BLAST allows one to input protein sequences and compare these against other protein sequences.

Standard protein-protein BLAST - Takes protein sequences in FASTA format, GenBank Accession numbers or GI numbers and compares them against the NCBI protein databases.

PSI-BLAST - Position Specific Iterated BLAST uses an iterative search in which sequences found in one round of searching are used to build a score model (a profile) for the next round of searching. Highly conserved positions receive high scores and weakly conserved positions receive scores near zero. The profile is used to perform a second BLAST search, and the results of each further "iteration" are used to refine the profile. This iterative searching strategy results in increased sensitivity.
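
The idea of a position-specific score model can be illustrated with a minimal sketch. It is not the PSI-BLAST implementation: it assumes a gap-free toy alignment over a four-letter alphabet, a uniform background, and simple add-one pseudocounts, and it leaves out sequence weighting and the iteration itself. All names are hypothetical.

```python
import math
from collections import Counter


def build_profile(aligned, background, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    alphabet = sorted(background)
    profile = []
    for column in zip(*aligned):  # walk the alignment column by column
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        scores = {}
        for res in alphabet:
            freq = (counts.get(res, 0) + pseudocount) / total
            # log-odds: observed frequency relative to background frequency
            scores[res] = math.log2(freq / background[res])
        profile.append(scores)
    return profile


def score(profile, seq):
    """Sum the position-specific scores of seq against the profile."""
    return sum(col[res] for col, res in zip(profile, seq))


if __name__ == "__main__":
    hits = ["ACDA", "ACEC", "ACDD", "ACAE"]          # toy hits from one round
    background = {r: 0.25 for r in "ACDE"}           # assumed uniform background
    profile = build_profile(hits, background)
    print(profile[0]["A"], profile[3]["A"])          # conserved vs. variable column
    print(score(profile, "ACDA"), score(profile, "CADA"))
```

In this toy alignment the first column is fully conserved and scores high for its residue, while the last column is evenly mixed and scores zero for every residue.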

PHI-BLAST - Pattern Hit Initiated BLAST combines matching of a regular expression pattern with a Position Specific Iterated protein search. PHI-BLAST can locate other protein sequences which both contain the regular expression pattern and are homologous to a query protein sequence.
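
The pattern-matching half of this search can be sketched as below, assuming the pattern is written in a PROSITE-style syntax. The converter handles only a subset of that syntax (single residues, [..] sets, {..} exclusions, x wildcards and repeat counts), the function names are hypothetical, and the homology step that PHI-BLAST performs around each pattern occurrence is not shown.

```python
import re

_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (subset) into a Python regex."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_pattern(sequence: str, pattern: str):
    """Return (start, matched_text) for each non-overlapping occurrence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("[LIVM]-x(2)-G-{P}-x(1,3)-K"))
    print(find_pattern("MKLAAGTSAKQ", "[LIVM]-x(2)-G-{P}-x(1,3)-K"))
```

Only sequences in which such an occurrence is found would go on to the homology comparison.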

Search for short, nearly exact sequences - This search is similar to the standard protein-protein BLAST, with the parameters set automatically to optimize for short query sequences. A short query is more likely to occur by chance in the database, so it is often necessary to raise the Expect value threshold and lower the word size before any results are returned. Low Complexity filtering is also turned off, since it would filter out a large percentage of a short sequence and leave little or no query sequence to search with. In addition, for short protein sequence searches the Matrix is changed to PAM-30, which is better suited to finding short regions of high similarity.

  Translating BLAST

Translating BLAST searches translate either query sequences or databases from nucleotides to proteins so that protein - nucleotide sequence comparisons can be performed.

Translated query - Protein db [blastx] - Converts a nucleotide query sequence into protein sequences in all six reading frames. The translated protein products are then compared against the NCBI protein databases.

Protein query - Translated db [tblastn] - Takes a protein query sequence and compares it against an NCBI nucleotide database which has been translated in all six reading frames.

Translated query - Translated db [tblastx] - Converts a nucleotide query sequence into protein sequences in all six reading frames and then compares these to an NCBI nucleotide database which has been translated in all six reading frames.
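
The six-frame translation that these three searches rely on can be sketched as follows. This is an illustration rather than the BLAST code: it assumes the standard genetic code, writes stop codons as "*", writes codons with ambiguous bases as "X", and uses frame labels +1 to +3 and -1 to -3 as one common convention. All names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; unknown codons give X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Map frame number (+1..+3 forward, -1..-3 reverse strand) to protein."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frames("ATGGCCTAA").items()):
        print(frame, protein)
```

In terms of this sketch, blastx translates the query this way, tblastn translates the database sequences, and tblastx translates both before the protein comparison.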

  CD-Search

CD-Search compares a protein sequence against the Conserved Domain Database with the RPS-BLAST program. This database currently contains domains derived from two popular collections, Smart and Pfam, plus contributions from colleagues at NCBI. This allows known functional and structural domains to be identified on protein query sequences.

  Pairwise BLAST

Pairwise BLAST performs a comparison between two sequences using the BLAST algorithm. Note that the program considers "Sequence 1" to be the Query sequence and "Sequence 2" to be the Subject sequence. The following program options are available:

blastn - for nucleotide - nucleotide comparisons

blastp - for protein - protein comparisons

tblastn - compares the protein "Sequence 1" against the nucleotide "Sequence 2" which has been translated in all six reading frames

blastx - compares the nucleotide "Sequence 1" against the protein "Sequence 2"

tblastx - compares nucleotide "Sequence 1" translated in all six reading frames against the nucleotide "Sequence 2" translated in all six reading frames.
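
The choice among these options follows from the types of the two sequences, as the small lookup below illustrates. The function, its arguments and the "nucleotide"/"protein" labels are hypothetical and are not part of the Pairwise BLAST page.

```python
def choose_program(seq1_type: str, seq2_type: str, translate_both: bool = False) -> str:
    """Pick a program from the types of Sequence 1 (query) and Sequence 2 (subject).

    translate_both only applies to nucleotide-nucleotide pairs, where it
    selects a translated comparison (tblastx) instead of blastn.
    """
    programs = {
        ("nucleotide", "nucleotide"): "tblastx" if translate_both else "blastn",
        ("protein", "protein"): "blastp",
        ("protein", "nucleotide"): "tblastn",
        ("nucleotide", "protein"): "blastx",
    }
    try:
        return programs[(seq1_type, seq2_type)]
    except KeyError:
        raise ValueError(f"unknown sequence types: {seq1_type!r}, {seq2_type!r}")


if __name__ == "__main__":
    print(choose_program("protein", "nucleotide"))
    print(choose_program("nucleotide", "nucleotide", translate_both=True))
```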

  Specialized BLAST pages

Specialized BLAST pages allow you to search databases related to specific organisms or fields of research.

Human Genome - Allows one to compare sequences against the completed Human Genome contigs from the Genome Sequencing Centers. The data can be searched as separate chromosomes or as all chromosomes at one time. Helpful for determining the possible chromosomal location of a sequence.

Finished and Unfinished Microbial Genomes - Allows comparisons against finished and unfinished microbial genomes from contributing Genome Centers.

P. falciparum - Separate webpage for BLAST searching all Plasmodium falciparum sequences at NCBI. Databases include protein and nucleotide sequences, including separate databases for ESTs, STSs, GSSs, and High Throughput Genomic sequences.

VecScreen - BLAST-based detection of vector contamination - VecScreen can be used for quickly identifying segments of a nucleic acid sequence that may be of vector origin. NCBI developed VecScreen to combat the problem of vector contamination in public sequence databases.

IgBLAST - Analysis of immunoglobulin sequences in GenBank - Reports the three germline V genes, two D* and two J* genes that show the closest match to the query sequence. Annotates the immunoglobulin domains (FWR1 through FWR3) according to Kabat et al. Matches the returned hits from the nr database to the closest germline V genes, thus making it easier to identify related sequences.

  Retrieve Results with an RID

With the QBLAST system all searches are assigned an RID (Request ID) number. You can use the Retrieve Results with an RID service to enter an RID and have the results displayed without having to rerun the search. You can also change any of the Format options, and the new settings will be applied to your search results, so the same results can be viewed in different formats for comparison. The BLAST result corresponding to an RID is stored on the BLAST servers for up to 24 hours.
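
Retrieving the same results in several formats can also be scripted. The sketch below only builds the request URLs and does not contact any server. The endpoint and the CMD, RID and FORMAT_TYPE parameter names are taken from the present-day public BLAST URL API and are an assumption here, since this page does not describe them; the RID shown is a made-up placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint of the current public BLAST URL API (not stated on this page).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def retrieval_url(rid: str, format_type: str = "Text") -> str:
    """Build a URL that asks for the stored results of an existing RID."""
    if not rid:
        raise ValueError("an RID is required")
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return BLAST_URL + "?" + urlencode(params)


if __name__ == "__main__":
    rid = "ABC123XYZ"  # placeholder, not a real Request ID
    for fmt in ("HTML", "Text", "XML"):
        print(retrieval_url(rid, fmt))
```

Because the search itself is not rerun, each URL would return the same stored hits, differing only in presentation, for as long as the RID remains on the server.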

  OLD BLAST

The OLD BLAST pages, as they existed before January 2001, are temporarily preserved for convenience.


Revised January 21, 2000