Biowulf at the NIH
Plink on Biowulf

PLINK is a whole-genome association analysis toolset designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.

PLINK (one syllable) is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others.

PLINK is not a parallel program. Single PLINK jobs should be run interactively on the Biowulf interactive nodes or Helix. If you have multiple PLINK jobs to run, the swarm utility is recommended.

Submitting a swarm of Plink jobs

The swarm program is a convenient way to submit large numbers of jobs all at once instead of manually submitting them one by one.

1. First create a separate directory for each plink run and put all the required input files in it (in the example below, the input files are test1.map and test1.ped).

2. For each directory, create a script file (named script1 in this example) containing the plink commands, as shown below. The --noweb option is needed so that plink does not try to connect to the internet (it otherwise checks the web for a newer version at startup). Also make sure the script file is executable:

-----------/data/user/plink/t1/script1 ----------
cd /data/user/plink/t1
plink --noweb --file test1
plink --noweb --file test1 --freq
plink --noweb --file test1 --assoc
plink --noweb --file test1 --make-bed

-------------------------------------------------

3. Now prepare the swarm command file (named cmdfile below), e.g.

------- cmdfile -------------
/data/user/plink/t1/script1
/data/user/plink/t2/script2

.....
....
/data/user/plink/tX/scriptX
---- end of cmdfile ---------

4. Now submit the swarm job. Note: you must request nodes with the x86-64 property.

biowulf% swarm -f cmdfile -l nodes=1:x86-64
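
Steps 1 to 3 can be scripted when there are many runs. The following is a minimal sketch, not a Biowulf-provided tool. It assumes run directories named t1, t2 and t3 under a base directory, and input files named testN.ped and testN.map that have already been copied into each one. The variables BASE, RUNS and CMDFILE are hypothetical names chosen for this example; set BASE to your own area (for example /data/user/plink) before using it.

```shell
#!/bin/sh
# Sketch only: build one plink script per run directory plus a swarm command file.
# BASE, RUNS and CMDFILE are illustrative names, not swarm or plink settings.
BASE="${BASE:-$PWD/plink_demo}"
RUNS="${RUNS:-t1 t2 t3}"
CMDFILE="${CMDFILE:-$PWD/cmdfile}"

: > "$CMDFILE"                       # start with an empty command file
n=1
for d in $RUNS; do
    mkdir -p "$BASE/$d"              # step 1: one directory per run
    script="$BASE/$d/script$n"

    # step 2: write the plink commands for this run (inputs assumed to be
    # test$n.ped and test$n.map inside the run directory)
    cat > "$script" <<EOF
cd "$BASE/$d"
plink --noweb --file test$n
plink --noweb --file test$n --freq
plink --noweb --file test$n --assoc
plink --noweb --file test$n --make-bed
EOF
    chmod +x "$script"               # the script file must be executable

    echo "$script" >> "$CMDFILE"     # step 3: one script path per line
    n=$((n + 1))
done
```

The resulting cmdfile has one script path per line and can be passed to swarm as in step 4.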

Submit a single Plink batch job

Single plink jobs would typically be submitted only for debugging purposes.

1. Create a script file which contains the Plink commands as below:

---------- /data/user/plink/run1/script --------------
#!/bin/csh -v
#PBS -N plinkJobName
#PBS -m be
#PBS -k oe
cd /data/user/plink/t1
plink --noweb --file test1
plink --noweb --file test1 --freq
plink --noweb --file test1 --assoc
plink --noweb --file test1 --make-bed
----------------- end of script ----------------------

2. Now submit the script using the 'qsub' command, e.g.

qsub -l nodes=1:x86-64 /data/user/plink/run1/script
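
Because --file test1 expects both test1.ped and test1.map, it can save a wasted submission to check for them first. The helper below is a sketch under that assumption; check_plink_inputs is a hypothetical name and is not part of PLINK or the Biowulf tools.

```shell
#!/bin/sh
# check_plink_inputs DIR PREFIX  (hypothetical helper, for illustration only)
# Returns 0 only if DIR/PREFIX.ped and DIR/PREFIX.map both exist and are non-empty.
check_plink_inputs() {
    dir=$1
    prefix=$2
    status=0
    for ext in ped map; do
        if [ ! -s "$dir/$prefix.$ext" ]; then
            echo "missing or empty: $dir/$prefix.$ext" >&2
            status=1
        fi
    done
    return $status
}

# Example use, with the paths from the steps above:
# if check_plink_inputs /data/user/plink/t1 test1; then
#     qsub -l nodes=1:x86-64 /data/user/plink/run1/script
# fi
```

The check only confirms that the files are present; it does not validate their contents.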

Documentation

http://pngu.mgh.harvard.edu/~purcell/plink/