High-Performance Computing at the NIH

Phred/Phrap/Consed on Helix


Phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base. The phred quality values have been thoroughly tested for both accuracy and power to discriminate between correct and incorrect base-calls. Phred can use the quality values to perform sequence trimming.

Phrap is a program for assembling shotgun DNA sequence data. Cross_match is a general purpose utility for comparing any two DNA sequence sets using a 'banded' version of swat. Swat is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties.

Consed/Autofinish is a tool for viewing, editing, and finishing sequence assemblies created with phrap. Finishing capabilities include allowing the user to pick primers and templates, suggesting additional sequencing reactions to perform, and facilitating checking the accuracy of the assembly using digest and forward/reverse pair information.

Phred/Phrap/Consed and associated tools have been developed by Phil Green, David Gordon and Brent Ewing at the University of Washington. For detailed information, see http://www.phrap.org/phredphrapconsed.html

Initialization
Available versions of Phred, Phrap, Cross_match, Swat, and Consed can be listed with the module commands, as in the example below.

helix% module avail phred

------------------ /usr/local/Modules/3.2.9/modulefiles ------------------
phred/020425 phred/071220

helix% module avail phrap             

------------------ /usr/local/Modules/3.2.9/modulefiles ------------------
phrap/1.080721 phrap/1.090518

helix% module avail consed

------------------ /usr/local/Modules/3.2.9/modulefiles ------------------
consed/23.0

helix% module load phred/071220 phrap/1.090518 consed/23.0

helix% module list
Currently Loaded Modulefiles:
  1) phred/071220     2) phrap/1.090518   3) consed/23.0

The 'module load' command will set up the PATHs appropriately for your shell, and will also set required environment variables such as PHREDPAR.
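
Before starting a session it can help to confirm that the environment really is set up. The following is a minimal sketch, not part of the Phred/Phrap/Consed distribution: check_phred_env is a hypothetical helper that tests whether PHREDPAR points to a readable file and whether the named tools can be found on PATH.

```shell
#!/bin/sh
# check_phred_env is a hypothetical helper, not shipped with phred.
check_phred_env() {
    rc=0
    # phred reads its parameter file from the path stored in PHREDPAR.
    if [ -z "${PHREDPAR:-}" ]; then
        echo "PHREDPAR is not set; run 'module load phred' first" >&2
        rc=1
    elif [ ! -r "$PHREDPAR" ]; then
        echo "PHREDPAR points to an unreadable file: $PHREDPAR" >&2
        rc=1
    fi
    # Each tool should be reachable through PATH once its module is loaded.
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            rc=1
        fi
    done
    return "$rc"
}
# Example call after 'module load':
#   check_phred_env phred phrap cross_match consed
```

The function only reports problems and returns a non-zero status; it does not load any modules itself.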

Phred/Cross Match Sample Session

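A set of example files is available in /usr/local/apps/consed/23.0/examples. Copy them into your data directory (creating the target directory first if it does not exist) and make a working copy of the "standard" example:

% mkdir -p /data/$USER/phred_example
% cp -r /usr/local/apps/consed/23.0/examples/* /data/$USER/phred_example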
% cd /data/$USER/phred_example
% cp -r standard/ test/
% cd test

Delete all the files in phd_dir and edit_dir:

% rm phd_dir/* edit_dir/*
% cd edit_dir

Run the phredPhrap script:

% phredPhrap

phredPhrap writes a number of output files into this directory. Please note that if you intend to use consed, you must use this phredPhrap perl script. Failure to use it will result in many consed features not working correctly, including consed's autofinish function, user-defined consensus tags, tagging of ALU and other repeats, and tagging of vector sequence.

Suppose you want to call bases from the chromat files in subdirectory "chromat_dir", use phrap to assemble the contigs, and run consed to edit/examine the contigs. In this case you must ask phred to create "phd" output files, which are required by consed:

% cd /data/$USER/phred_example/test
% phred -id chromat_dir -pd phd_dir

This causes phred to read the chromat files in "chromat_dir" and write the "phd" files to "phd_dir". Next, make FASTA files from the "phd" files by running the phd2fasta program:

% phd2fasta -id phd_dir -os seqs_fasta -oq seqs_fasta.screen.qual

Then screen out the vector sequence from the reads in "seqs_fasta" using cross_match:

% cross_match seqs_fasta vector.seq -minmatch 12 -minscore 20 -screen > screen.out

This generates the screened sequence file "seqs_fasta.screen".
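
The three commands above can be chained so that a failure in one step stops the rest. The following is a minimal sketch under stated assumptions: screen_reads is a hypothetical wrapper name, the commands and file names are exactly those used above, and it must be run from the directory that contains chromat_dir, phd_dir and vector.seq.

```shell
#!/bin/sh
# screen_reads is a hypothetical wrapper around the three commands shown above.
screen_reads() {
    # Call bases and write one phd file per chromat file.
    phred -id chromat_dir -pd phd_dir || return 1
    # Convert the phd files into a FASTA file plus a quality file.
    phd2fasta -id phd_dir -os seqs_fasta -oq seqs_fasta.screen.qual || return 1
    # Screen vector sequence; the cross_match report goes to screen.out.
    cross_match seqs_fasta vector.seq -minmatch 12 -minscore 20 -screen > screen.out || return 1
    # The screened reads are expected in seqs_fasta.screen.
    if [ ! -s seqs_fasta.screen ]; then
        echo "seqs_fasta.screen is missing or empty" >&2
        return 1
    fi
}
# Example call: screen_reads && echo "screening finished"
```

Returning at the first failing step means a later program never runs on missing or partial input from an earlier one.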

Phrap/Phred Sample Session

First follow the 'Phred/Cross Match Sample Session' above.

Then run phrap to perform the sequence assembly as follows:

% phrap seqs_fasta.screen -new_ace > phrap.out
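
A small wrapper can confirm that the assembly step produced an ace file before consed is started. This is a sketch, not part of phrap: assemble_reads is a hypothetical name, and it assumes that phrap with -new_ace names the ace file by appending ".ace" to the input file name.

```shell
#!/bin/sh
# assemble_reads is a hypothetical wrapper around the phrap call shown above.
assemble_reads() {
    input=${1:-seqs_fasta.screen}
    if [ ! -s "$input" ]; then
        echo "input $input is missing or empty" >&2
        return 1
    fi
    # phrap writes its report to stdout; keep it in phrap.out as above.
    phrap "$input" -new_ace > phrap.out || return 1
    # Assumption: the ace file is named <input>.ace.
    if [ ! -s "$input.ace" ]; then
        echo "expected $input.ace was not written" >&2
        return 1
    fi
}
# Example call: assemble_reads seqs_fasta.screen
```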

As another example, suppose you again want to process the chromat files in subdirectory "chromat_dir", but now you want phred to write the base calls to a FASTA file named "seqs_fasta" and the base quality values to "seqs_fasta.qual". In this case, run phred with the options:

% phred -id chromat_dir -sa seqs_fasta -qa seqs_fasta.qual
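
Since the sequence file and the quality file should describe the same reads, a quick consistency check is to compare the number of records in each. The sketch below assumes both files use FASTA-style '>' header lines; same_record_count is a hypothetical helper, not a phred command.

```shell
#!/bin/sh
# same_record_count is a hypothetical helper, not part of phred.
same_record_count() {
    fasta=$1
    qual=$2
    for f in "$fasta" "$qual"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    # Every record in either file starts with a '>' header line.
    n_seq=$(grep -c '^>' "$fasta" || true)
    n_qual=$(grep -c '^>' "$qual" || true)
    echo "$fasta: $n_seq records, $qual: $n_qual records"
    [ "$n_seq" -eq "$n_qual" ]
}
# Example call: same_record_count seqs_fasta seqs_fasta.qual
```

The function returns a non-zero status when the counts differ or a file cannot be read.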

Consed Sample Session

The following demo can be found in the Consed 23.0 documentation.

You need an X-Windows connection to Helix.
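
One quick way to see whether the current login has an X connection is to look at the DISPLAY variable, which X clients such as consed use to find the X server. The sketch below is only a first check: check_display is a hypothetical helper, and a set DISPLAY does not guarantee that the X server is actually reachable.

```shell
#!/bin/sh
# check_display is a hypothetical helper, not part of consed.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; reconnect with X11 forwarding (for example 'ssh -X' or 'ssh -Y')" >&2
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}
# Example call: check_display && consed
```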

Copy the sample files into your own area if you have not already done so:

% mkdir -p /data/$USER/phred_example
% cp -r /usr/local/apps/consed/23.0/examples/* /data/$USER/phred_example

ADDING SOLEXA READS

% cd /data/$USER/mydir
% fasta2Ace.perl ref.fa
% addSolexaReads.perl ref.ace bustard_files.fof ref.fa

ADDING 454 READS

% cd /home/user/consed/sample1/align454reads/edit_dir
% fasta2Ace.perl reference.fa

Make sure your X-Windows application is running, then bring up Consed and double-click on 'reference.ace.1':

% consed

Then double-click on contig "myreference" to bring up the Aligned Reads Window. Scroll around a little and right-click on a read or two to see the trace.

[Screenshot: Aligned Reads Window]

Close all windows to exit consed.

454 READS (NEWBLER ASSEMBLY)

The Newbler Assembler and Consed work together. To see a Newbler assembly:

% cd /data/user/consed/sampleTEST/454_newbler/edit_dir
% consed

Double click on "454Contigs.ace.1" on top.

Double click on "contig00001" to bring up the Aligned Reads Window.

Using the thumb at the bottom, scroll from the far left of the contig all the way to the far right to get an idea of the assembly. (It is a very small one.)

In the Aligned Reads Window, scroll to position 230 and right-click on the T in read EBE03TV01CI9BG.1-240 (which is the top read). A menu of options is displayed. Select 'Display traces for all reads'.

RESTRICTION DIGEST

The sample file 'standard.fasta.screen.ace.1' is under '/home/user/consed/sample1/standard/edit_dir'.

Follow step 174 in the Consed 23.0 documentation.

ADD NEW READS

% cd /home/user/consed/sample1/standard/edit_dir
% cp ../chromats_to_add/* ../chromat_dir

Restart consed and use the original ace file standard.fasta.screen.ace.1. If it asks whether you want to apply edits, just say 'no'.

In the Main Window, click on the Add New Reads button. A list of files ending in .fof appears. These are files that contain lists of chromatograms. Double-click on 'reads_to_add.fof'. (Accept the defaults for the other options in this window.) There should be a lot of progress output in the xterm from which you started Consed. When it completes, a Reads Added Window pops up with a report of which reads were added. In this case, it should say that 9 reads were successfully added and list them.
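
For your own data you would need a .fof file of your own. The following is a minimal sketch, assuming that a .fof file is a plain text file with one chromatogram file name per line and no directory part; check the Consed documentation for the exact format. make_fof is a hypothetical helper, not a consed script.

```shell
#!/bin/sh
# make_fof is a hypothetical helper, not part of consed.
# Usage: make_fof <directory with chromatograms> <output .fof file>
# Write the output file outside the chromatogram directory so that
# it does not list itself.
make_fof() {
    src_dir=$1
    out=$2
    # One bare file name per line; subdirectories are skipped.
    for f in "$src_dir"/*; do
        if [ -f "$f" ]; then
            basename "$f"
        fi
    done > "$out"
    # Succeed only if at least one file name was written.
    [ -s "$out" ]
}
# Example call from edit_dir: make_fof ../chromats_to_add my_reads.fof
```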

ASSEMBLY VIEW

Consed can show you a bird's eye view of the Assembly using forward/reverse pair information, sequence match information, read depth, etc. We have a test database which shows its features.

Exit consed and type:

% cd /home/user/consed/sample1/assembly_view/edit_dir
% consed

Double click on "assembly_view.fasta.screen.ace.1"

In the Consed Main Window, click on the "Assembly View" button, which is near the upper left corner of the window. The Assembly View window appears.

[Screenshot: Assembly View window]

RUNNING CROSSMATCH FOR SEQUENCE MATCHES

Click on 'What to show', then 'Sequence Matches'. The 'Which Sequence Matches to Show In Assembly View' window comes up. Click on the 'Run Crossmatch' button and watch the xterm: several pages of crossmatch output should scroll by. Three pairs of orange curved lines will then appear in the Assembly View window, as in the Assembly View screenshot above.

 

Documentation

Consed 23.0 documentation

http://www.phrap.org/phredphrapconsed.html
