Massively Parallel Sequence Comparison on a 16,384-Processor MasPar MP-2
The Laboratory of Mathematical Biology, in conjunction with the Frederick
Biomedical Supercomputer Center, has been exploring the use of massively
parallel computation for sequence comparison. Given the large
quantities of protein and nucleic acid sequence information
produced in laboratories every year, fast methods are needed
for comparing sequences against the other sequences in these
rapidly growing databases. We have been using
MPSRCH on our 16,384-processor MasPar MP-2 in this endeavor. The program
is a parallel implementation of the Smith-Waterman dynamic programming
algorithm for sequence comparison. We have seen rates as
high as 1.5 billion cell updates per second with searches using MPSRCH
against a non-redundant nucleic acid database consisting of over 800
million bases (this includes the reverse complement form of the database).
Such rapid comparisons increase researcher productivity and allow room
for varied experimentation.
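
For readers unfamiliar with the method, the following is a minimal serial sketch of the Smith-Waterman recurrence and of what a "cell update" rate means for search time. It is not the MPSRCH code: the function names, the scoring values (match, mismatch, gap) and the linear gap penalty are assumptions chosen only for illustration.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Uses a linear gap penalty. The scoring values are illustrative
    assumptions, not the parameters used by MPSRCH.
    """
    prev = [0] * (len(target) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            # Each pass through this inner loop is one "cell update".
            sub = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align the two residues
                prev[j] + gap,      # gap in the target
                curr[j - 1] + gap,  # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best


def estimated_search_seconds(query_length, database_residues, cell_updates_per_second):
    """Estimate search time: one cell per (query residue, database residue) pair."""
    return query_length * database_residues / cell_updates_per_second


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "GGACGTGG"))
    # Hypothetical 1,000-base query against 800 million bases at the peak rate.
    print(estimated_search_seconds(1000, 800_000_000, 1.5e9))
```

Under these assumptions, a hypothetical 1,000-base query against 800 million bases at the peak rate of 1.5 billion cell updates per second would take roughly nine minutes. Sustained rates may be lower than the peak figure.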
MPSRCH consists of a suite of eight programs for searching the protein and
nucleic acid databases. These programs are available through a Mosaic server
and are as follows:
- Protein sequence query(ies) versus a protein sequence database.
- Protein sequence query(ies) versus a protein sequence database. Uses affine gap penalties.
- Nucleic acid sequence query(ies) versus a nucleic acid sequence database.
- Nucleic acid sequence query(ies) versus a nucleic acid sequence database. Uses affine gap penalties.
- Nucleic acid sequence query(ies) versus a protein sequence database that has been back-translated into nucleic acid sequences.
- Nucleic acid sequence query(ies) versus a protein sequence database that has been back-translated into nucleic acid sequences. Uses affine gap penalties.
- Protein sequence query(ies) that have been back-translated into nucleic acid sequences versus a protein sequence database.
- Protein sequence query(ies) that have been back-translated into nucleic acid sequences versus a protein sequence database. Uses affine gap penalties.
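
Several of the programs listed above use affine gap penalties, where opening a gap costs more than extending one. The sketch below shows one common way to add this to a Smith-Waterman score computation (the three-state recurrence usually attributed to Gotoh). It is an illustration only, not the MPSRCH implementation: the function name and all scoring values are assumptions, and a gap of length k is taken to cost gap_open + (k - 1) * gap_extend.

```python
NEG_INF = float("-inf")


def smith_waterman_affine(query, target, match=2, mismatch=-1,
                          gap_open=-3, gap_extend=-1):
    """Return the best local alignment score with affine gap penalties.

    All scoring values are illustrative assumptions. gap_open is the
    cost of the first gap position, gap_extend of each further one.
    """
    n = len(target)
    h_prev = [0] * (n + 1)        # H: best score of an alignment ending at a cell
    e_prev = [NEG_INF] * (n + 1)  # E: best score ending in a gap running down a column
    best = 0
    for i in range(1, len(query) + 1):
        h_curr = [0] * (n + 1)
        e_curr = [NEG_INF] * (n + 1)
        f = NEG_INF               # F: best score ending in a gap running along the row
        for j in range(1, n + 1):
            # Either open a new gap from H or extend an existing one.
            e_curr[j] = max(h_prev[j] + gap_open, e_prev[j] + gap_extend)
            f = max(h_curr[j - 1] + gap_open, f + gap_extend)
            sub = match if query[i - 1] == target[j - 1] else mismatch
            h_curr[j] = max(0, h_prev[j - 1] + sub, e_curr[j], f)
            best = max(best, h_curr[j])
        h_prev, e_prev = h_curr, e_curr
    return best


if __name__ == "__main__":
    # The target carries a two-base insertion ("CC") relative to the query.
    print(smith_waterman_affine("AAAATTTT", "AAAACCTTTT"))
```

With these example values, bridging the two-base insertion costs 3 + 1 = 4, so a single long gap is penalized less than two separate gaps of length one would be.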
At present, access to the server from outside NIH requires a password.
To activate server CLICK HERE!
Please contact: Dr. Bruce Shapiro
Laboratory of Mathematical Biology
National Cancer Institute
Frederick Cancer Research and Development Center
Building 469, Room 150
Frederick, Maryland 21702
email: bshapiro@ncifcrf.gov
301-846-5535
or
Gary Smythers
NCI-FCRDC
Advanced Scientific Computing Laboratory
PO Box B, Bldg. 430
Frederick, Md. 21702-1201
gws@ncifcrf.gov
301-846-5779