Biowulf at the NIH
Meme & Mast on Biowulf
Meme is designed to discover motifs (highly conserved regions) in groups of related DNA or protein sequences, and Mast searches sequence databases using those motifs. Meme & Mast were developed at UCSD and Purdue. Meme/Mast website.

Meme is CPU-intensive for large numbers of sequences or long sequences. Short jobs are most easily run on Helix, but larger datasets call for a parallel run on Biowulf.

How to run Meme on Biowulf

Your input should be a single file containing sequences in FASTA format. In the example below, the file is 'mini-drosoph.seqs'. Use 'wc -c filename' to find the number of characters in the file, and set the 'maxsize' parameter to a value at least that large. Set up a batch script along the lines of the one below:
------- this file is called meme.batch-----------------
#!/bin/csh
#PBS -N Meme
#PBS -m be
#PBS -j oe

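# Put the MPICH tools (mpirun) on the path
setenv PATH /usr/local/mpich/bin:$PATH

# Change to the directory that holds the input sequences
cd /data/user/mydir/

# Run the parallel version of Meme on $np processors
# ($np is supplied on the qsub command line with -v)
mpirun -machinefile $PBS_NODEFILE -np $np  /usr/local/meme/bin/meme_p \
      mini-drosoph.seqs -dir /usr/local/meme/ -maxsize 500000 -text > mini-drosoph.meme

# Run Mast on the motifs that Meme wrote to mini-drosoph.meme
mast mini-drosoph.meme -text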
-------------------------------------------------------

Submit this script using:
qsub -v np=32 -l nodes=16 meme.batch
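
In the command above, np=32 is paired with nodes=16, which suggests two processors per node. The following is a minimal sketch, under that assumption, of how the two values could be kept consistent when choosing a different processor count. It is written in plain sh rather than csh, the function name meme_qsub_cmd and the variable PROCS_PER_NODE are hypothetical, and it only prints the qsub command so that it can be checked before being run.

```shell
#!/bin/sh
# Sketch: build the qsub command line for a given processor count.
# Assumption: 2 processors per node, as implied by np=32 with nodes=16.
PROCS_PER_NODE=2

# meme_qsub_cmd is a hypothetical helper, not part of Meme or PBS.
meme_qsub_cmd() {
    np=$1
    # Round up so that an odd processor count still gets enough nodes.
    nodes=$(( (np + PROCS_PER_NODE - 1) / PROCS_PER_NODE ))
    echo "qsub -v np=$np -l nodes=$nodes meme.batch"
}

# Print the command instead of running it.
meme_qsub_cmd 32
```

If the printed line looks right, it can be pasted at the prompt. If the nodes you are given have a different number of processors, change PROCS_PER_NODE accordingly.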

Meme scales well, and large meme jobs (maxsize ~500,000) can be submitted on up to 128 processors.
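
The maxsize value is based on the character count of the input file, as reported by 'wc -c'. The following is a minimal sketch of that step in plain sh (not csh). The helper name fasta_chars is hypothetical, and the two short sequences are made-up demonstration data written to a temporary file; with a real dataset you would pass your own file name, such as mini-drosoph.seqs.

```shell
#!/bin/sh
# Sketch: report the character count of a FASTA file for Meme's -maxsize.
# fasta_chars is a hypothetical helper name, not part of Meme.
fasta_chars() {
    # Redirecting the file makes wc print only the number; tr strips
    # the padding that some wc implementations add.
    wc -c < "$1" | tr -d ' '
}

# Demonstration on a small throwaway file with two made-up sequences.
demo=$(mktemp)
printf '>seq1\nACGTACGT\n>seq2\nTTGACA\n' > "$demo"
size=$(fasta_chars "$demo")
echo "-maxsize should be at least $size for this file"
rm -f "$demo"
```

The count includes header lines and newlines, so it is slightly larger than the number of residues.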

Documentation

  1. Type 'meme' or 'mast' with no parameters on the command line to see a list of all available options and more information.
  2. Meme documentation at the SDSC website.
  3. Mast documentation at the SDSC website.