Protein Informatics Group
  
         

PROSPECT Version 2.0:


A Quick Guide

This section gives a quick tutorial on the general usage of PROSPECT (after the program is properly installed). After reading it, you should be able to start using the program in your research; you can learn the other functions gradually by checking the details in other parts of this manual.

We will use examples from the "demo" subdirectory of the PROSPECT installation. More demo commands can be found in the file "demo/readme". The expected results of running the demos are available in "demo/results".


Environment and Requirements

Because PROSPECT is an XML-based application, it requires libxml2 to parse its template files.  If libxml2 is not installed on your system, you can download it from http://www.libxml.org

PROSPECT also needs to be able to find its related files.  Set the PROSPECT_PATH environment variable to point to the base of the installation, or pass the location with the -prospect_path argument on the command line.
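
As a minimal sketch of the environment setup described above, the following lines set PROSPECT_PATH in a Bourne-style shell. The directory "$HOME/prospect" is only a placeholder for wherever PROSPECT is actually installed, and the check for the "demo" subdirectory is an illustrative sanity test, not something PROSPECT itself requires.

```shell
# Placeholder install location (an assumption); replace it with the real
# base directory of your PROSPECT installation.
PROSPECT_PATH="${PROSPECT_PATH:-$HOME/prospect}"
export PROSPECT_PATH

# Illustrative sanity check: the demo files used in this guide live in
# the "demo" subdirectory of the installation.
if [ ! -d "$PROSPECT_PATH/demo" ]; then
    echo "warning: $PROSPECT_PATH/demo not found; check PROSPECT_PATH" >&2
fi

# Alternative mentioned in the text: pass the location on the command line.
# prospect.LINUX -prospect_path "$PROSPECT_PATH" -seqfile 1onc.seq
```

Adding the first two lines to a shell startup file avoids having to repeat them in every session.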

Basic Threading

Go to the "demo/1onc" directory. The simplest threading run needs only a sequence file, such as "1onc.seq". The sequence file can be either in standard FASTA format (see example 1) or in a flexible format (see example 2). On a Linux machine, type the following in a terminal:

prospect.LINUX -seqfile 1onc.seq

PROSPECT will automatically thread the sequence against 3200+ templates in the FSSP database. Depending on your computer and the size of the sequence, this job may take less than 5 minutes or more than an hour. After the job is finished, you will see the file:

1onc.seq.xml

To list these results sorted by raw score, run:

sortProspect.LINUX 1onc.seq.xml

You can send the sorted results to a file with standard Unix redirection:

sortProspect.LINUX 1onc.seq.xml > 1onc.seq.sort
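
The two steps above can be combined in a small shell function. This is a sketch rather than part of PROSPECT: the function name thread_and_sort and the PROSPECT_BIN and SORT_BIN variables are hypothetical, and it assumes the result file is named by appending ".xml" to the sequence file name, as in the 1onc example.

```shell
# thread_and_sort SEQFILE
# Runs basic threading, then writes the results sorted by raw score to
# SEQFILE.sort. PROSPECT_BIN and SORT_BIN are hypothetical overrides for
# the program names used in this guide.
thread_and_sort() {
    seq="$1"
    prospect_bin="${PROSPECT_BIN:-prospect.LINUX}"
    sort_bin="${SORT_BIN:-sortProspect.LINUX}"

    if [ ! -r "$seq" ]; then
        echo "error: cannot read sequence file $seq" >&2
        return 1
    fi

    # Step 1: thread the sequence against the template database.
    "$prospect_bin" -seqfile "$seq" || return 1

    # Assumption: output is SEQFILE.xml, as with 1onc.seq -> 1onc.seq.xml.
    if [ ! -s "${seq}.xml" ]; then
        echo "error: expected result file ${seq}.xml was not produced" >&2
        return 1
    fi

    # Step 2: sort by raw score and save the listing.
    "$sort_bin" "${seq}.xml" > "${seq}.sort"
}
```

From the "demo/1onc" directory it would be called as: thread_and_sort 1onc.seq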

Note that although basic threading is the least computationally expensive mode, it may not give the best result.  For the best result, it is generally recommended that you use the predicted secondary structure and the sequence profile information.  Sorting by z-score rather than raw score also generally produces better results.

Using Secondary Structure Prediction in Threading (Recommended)

Using secondary structure prediction can often improve threading performance, especially alignment accuracy. Here we show an example in the "demo/1bgc" directory, which contains a sequence file "1bgc.seq" and a secondary structure prediction file "1bgc.ss".
The secondary structure prediction can be obtained from the on-line server PHD developed by Burkhard Rost, or by running our program prospect_ssp. Pass the prediction file to PROSPECT with the -phdfile option:

prospect.LINUX -phdfile 1bgc.ss

To create a secondary structure file yourself, run prospect_ssp:

prospect_ssp.LINUX -seqfile 1bgc.seq -p > 1bgc.seq.ss

For better results, obtain a Blast checkpoint file to use as a profile:

get_chk_file 1bgc.seq

Then predict the secondary structure from this checkpoint file:

prospect_ssp.LINUX -chkfile 1bgc.seq.chk
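
The prediction and threading steps in this section can be chained in a small wrapper. The sketch below is only an illustration: the function name thread_with_ss and the PROSPECT_SSP_BIN and PROSPECT_BIN variables are hypothetical, and it assumes that the file written by prospect_ssp can be passed directly to -phdfile, which the text implies but does not state outright.

```shell
# thread_with_ss SEQFILE
# Predicts secondary structure with prospect_ssp, then threads using the
# prediction. PROSPECT_SSP_BIN and PROSPECT_BIN are hypothetical overrides
# for the program names used in this guide.
thread_with_ss() {
    seq="$1"
    ss="${seq}.ss"
    ssp_bin="${PROSPECT_SSP_BIN:-prospect_ssp.LINUX}"
    prospect_bin="${PROSPECT_BIN:-prospect.LINUX}"

    if [ ! -r "$seq" ]; then
        echo "error: cannot read sequence file $seq" >&2
        return 1
    fi

    # Step 1: write the predicted secondary structure to SEQFILE.ss,
    # using the same options as the example command in this section.
    "$ssp_bin" -seqfile "$seq" -p > "$ss" || return 1

    if [ ! -s "$ss" ]; then
        echo "error: secondary structure file $ss is empty" >&2
        return 1
    fi

    # Step 2 (assumption): thread with the prediction just produced.
    "$prospect_bin" -phdfile "$ss"
}
```

From the "demo/1bgc" directory it would be called as: thread_with_ss 1bgc.seq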

Using Evolutionary Information in Threading (Recommended)

Threading can be improved substantially by using evolutionary information, supplied here through the -freqfile option:

prospect.LINUX -phdfile 1bgc.ss -freqfile 1bgc.freq


Calculating z-scores

The z-score expresses a threading score in units of standard deviation relative to the mean of the score distribution obtained for random sequences with the same amino acid composition and length as the query sequence. In practice, this distribution is estimated by repeatedly threading a large number (>100) of randomly shuffled copies of the query sequence against the template.  To calculate z-scores, use the "-reliab" option:

prospect.LINUX -phdfile 1bgc.ss -freqfile 1bgc.freq -reliab
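
To make the z-score definition concrete, here is a small stand-alone calculation in shell and awk. It is not PROSPECT's implementation: the function name zscore is hypothetical, the sample standard deviation is an assumed choice, and the sign convention PROSPECT reports may differ. It reads the scores of the shuffled sequences from standard input, one per line, and takes the raw score of the real query as its argument.

```shell
# zscore RAW_SCORE < shuffled_scores
# Prints (RAW_SCORE - mean) / sd, where mean and sd are computed from the
# scores read on standard input (one score per line, blank lines ignored).
zscore() {
    awk -v raw="$1" '
        NF { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n < 2) {
                print "error: need at least two scores" > "/dev/stderr"
                exit 1
            }
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)   # sample variance
            if (var <= 0) {
                print "error: scores have zero variance" > "/dev/stderr"
                exit 1
            }
            printf "%.2f\n", (raw - mean) / sqrt(var)
        }'
}
```

For example, with made-up shuffled scores 1, 2 and 3 (mean 2, standard deviation 1) and a raw score of 4, the command printf '1\n2\n3\n' | zscore 4 prints 2.00.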

Warning: Calculating z-scores takes much longer (approximately 100 times) than basic threading.  To limit the cost, use the "-tfile" option to calculate z-scores for a small number of selected templates, or use Prospect Manager.


Prospect Manager

If you want to do threading with the least amount of effort, try Prospect Manager, a graphical interface for the PROSPECT tool suite.


Life Sciences Division  -  ORNL