Biowulf at the NIH
RandFold on Biowulf

   A randomization test for sequence secondary structure.


This is RandFold version 2. For a given RNA sequence, the software computes the probability that the minimum free energy (MFE) of its secondary structure differs from the distribution of MFE values computed for randomized versions of that sequence.
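As a rough sketch of how such a probability is usually estimated in a randomization test (an assumption about the method, not something stated on this page), let N be the number of randomizations and R the number of randomized sequences whose MFE is less than or equal to that of the original sequence:

```latex
% Assumed Monte Carlo estimate; N and R are illustrative symbols, not RandFold option names.
p = \frac{R + 1}{N + 1}
% With N = 999 and R = 0 this gives p = 1/1000 = 0.001,
% which is consistent with the value 0.001000 in the interactive run shown later on this page.
```

Under this assumption, the smallest probability a run can report is 1/(N+1), so the number of randomizations limits the resolution of the result.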

RandFold was developed by Eric Bonnet at the Bioinformatics & Evolutionary Genomics group, Universiteit Gent in Belgium.  A web page referencing the research may be found at bioinformatics.psb.ugent.be.

RandFold is not a parallel program. A small number of RandFold jobs, or interactive RandFold runs, can be run on Helix or on the Biowulf interactive nodes. If you have many RandFold jobs to run, the swarm utility is recommended.

RandFold Options

Syntax:
randfold <method> <file name> <number of randomizations>
Methods available:
-s  simple mononucleotide shuffling
-d  dinucleotide shuffling
-m  markov chain 1 shuffling

Output:
<sequence name> tab <mfe> tab <probability>
Example:
cel-let-7       -42.90  0.001000
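
Because the output is plain tab-separated text, it can be post-processed with standard tools. The following is a minimal sketch, assuming a results file in the three-column format above; the file name results.txt, the second (hypothetical) sequence, and the 0.01 cutoff are illustrative choices, not part of RandFold.

```shell
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
results="$workdir/results.txt"

# Two lines in the <sequence name> tab <mfe> tab <probability> format.
# The second entry is made up purely for illustration.
printf 'cel-let-7\t-42.90\t0.001000\n' > "$results"
printf 'hypothetical-seq\t-10.20\t0.350000\n' >> "$results"

# Print the names of sequences whose probability is at or below the cutoff.
significant=$(awk -F'\t' -v cutoff=0.01 '$3 <= cutoff { print $1 }' "$results")
echo "$significant"
```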

Running an interactive RandFold job

Please note that only very short jobs (< 1 min) should be run on the Biowulf head node. For interactive jobs running longer than one minute, please request an interactive batch node and run them there.

[user@biowulf ~]$ qsub -I -l nodes=1
qsub: waiting for job 2948120.biobos to start
qsub: job 2948120.biobos ready

[user@p3 ~]$ cd mydir
[user@p3 ~/mydir]$ /usr/local/randfold-2.0/bin/randfold -d let7.tfa 999
cel-let-7       -42.90  0.001000
[user@p3 ~/mydir]$ 
[user@p3 ~/mydir]$ 
[user@p3 ~/mydir]$ exit
logout

qsub: job 2948120.biobos completed
[user@biowulf ~]$

Submitting a single RandFold batch job

Single RandFold jobs would typically be submitted only for debugging purposes.

1. Create a script file which contains the RandFold commands as below:

--------- /data/username/randfold/testrun1/script ------------
#!/bin/csh -v
#PBS -N randfoldJobName
#PBS -m be
#PBS -k oe
cd /data/username/randfold/testrun1
/usr/local/randfold-2.0/bin/randfold -d let7.tfa 999 > ./let7.out
--------------------- end of script --------------------------

2. Now submit the script using the 'qsub' command, e.g.

[user@biowulf ~]$ qsub -l nodes=1 /data/username/randfold/testrun1/script

Submitting a swarm of RandFold jobs

The swarm program is a convenient way to submit large numbers of jobs all at once instead of manually submitting them one by one.

1. First create a directory for the RandFold runs and put the required input files in it. This example uses a single directory for all runs; you can also create a separate directory for each run.

[user@biowulf ~]$ mkdir /data/username/randfold

2. For each run, create a script file (named rfX.script in this example) that contains the RandFold command as below. Make sure each script file is executable:

----------- /data/username/randfold/rf1.script -----------
cd /data/username/randfold
/usr/local/randfold-2.0/bin/randfold -d let7.tfa 999 > ./let7.out
----------------------------------------------------------

----------- /data/username/randfold/rf2.script -----------
cd /data/username/randfold
/usr/local/randfold-2.0/bin/randfold -d let8.tfa 999 > ./let8.out
----------------------------------------------------------

3. Now prepare the swarm command file (named cmdfile below), e.g.

-------------- cmdfile ----------------
/data/username/randfold/rf1.script
/data/username/randfold/rf2.script
.....
....
/data/username/randfold/rfX.script
----------- end of cmdfile ------------

4. Now submit the swarm job.

[user@biowulf ~]$ swarm -f cmdfile -l nodes=1
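
With many input files, writing each rfX.script by hand becomes tedious. The sketch below generates the per-run scripts and the cmdfile in a loop. It is only an illustration: RUNDIR stands in for /data/username/randfold (a temporary directory is used here so the sketch runs anywhere), the .tfa files are empty placeholders, and the loop itself is not part of swarm or RandFold.

```shell
# RUNDIR is a stand-in for /data/username/randfold.
RUNDIR=$(mktemp -d)

# Placeholder inputs; on Biowulf these would be your real FASTA files.
touch "$RUNDIR/let7.tfa" "$RUNDIR/let8.tfa"

n=0
: > "$RUNDIR/cmdfile"    # start with an empty command file
for fasta in "$RUNDIR"/*.tfa; do
    n=$((n + 1))
    base=$(basename "$fasta" .tfa)
    script="$RUNDIR/rf$n.script"

    # Each generated script mirrors the hand-written rfX.script files above.
    printf 'cd %s\n/usr/local/randfold-2.0/bin/randfold -d %s.tfa 999 > ./%s.out\n' \
        "$RUNDIR" "$base" "$base" > "$script"
    chmod +x "$script"

    # One script path per line, as in the cmdfile shown above.
    echo "$script" >> "$RUNDIR/cmdfile"
done

cat "$RUNDIR/cmdfile"
```

The resulting cmdfile can then be passed to swarm with the -f option exactly as in step 4.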

More information

For more information, see the paper published in Bioinformatics:

Bonnet E., Wuyts J., Rouze P., Van de Peer Y.
Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences.
Bioinformatics. 2004 Nov 22;20(17):2911-7.
PMID: 15217813

A large collection of protein sequence databases is available in /fdb/fastadb/. See the page on FASTA-format databases and their update status.