Protein Informatics Group
  

PROSPECT Version 2.0:


General Outputs


Energy Components

The output file reports the score of each energy term, for example:

The raw score of the alignment: -1240.4 (-1242) Mutation = -1355.0; Singleton = -774.5; Pairwise = -290.8; InternalPair = -31.1; GapPenalty = 2211.0; SSFit = 0.0; AnchorMatch = -1000.0 (1-anchor); NMR-Backbone; = 0.0; PairConstrn = 0.0 (0-pair); PairVialate = 0.0.

The raw score of the alignment: the total energy. The first number is calculated from the final alignment; the integer in parentheses is the result of the combinatorial search. The difference between the two numbers gives an estimate of the accuracy of the threading.

Mutation: mutation energy component of cores.

Singleton: singleton energy component of cores.

Pairwise: the component of pairwise energy between cores.

InternalPair: the component of pairwise energy within cores.

GapPenalty: gap penalty component together with mutation and singleton energies in the non-core regions.

SSFit: the agreement between the secondary structure prediction and the secondary structures assigned from the PROSPECT alignment, when a secondary structure prediction is supplied as input.

AnchorMatch: the contribution of matching the predefined alignments between the anchor residues on the target sequence and the positions on the template.

PairConstrn: the contribution of matching the predefined pairs between the residues on the target sequence.
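As a rough illustration, the sample raw-score line above can be parsed and checked with a short Python sketch. The function name `parse_energy_line` and the regular expressions are assumptions made for this example and are not part of PROSPECT; the patterns were written against the sample line only and may need adjusting for other output.

```python
import re

# The sample line from this page, copied verbatim (including its stray ';').
SAMPLE = ("The raw score of the alignment: -1240.4 (-1242) Mutation = -1355.0; "
          "Singleton = -774.5; Pairwise = -290.8; InternalPair = -31.1; "
          "GapPenalty = 2211.0; SSFit = 0.0; AnchorMatch = -1000.0 (1-anchor); "
          "NMR-Backbone; = 0.0; PairConstrn = 0.0 (0-pair); PairVialate = 0.0.")

def parse_energy_line(line):
    """Return (final_score, search_score, components) for one raw-score line."""
    head = re.search(
        r"raw score of the alignment:\s*(-?\d+(?:\.\d+)?)\s*\((-?\d+)\)", line)
    if head is None:
        raise ValueError("not a raw-score line")
    final_score = float(head.group(1))   # computed from the final alignment
    search_score = int(head.group(2))    # result of the combinatorial search
    # Each term looks like "Name = value"; a stray ';' may follow the name.
    terms = re.findall(
        r"([A-Za-z][A-Za-z-]*);?\s*=\s*(-?\d+(?:\.\d+)?)", line[head.end():])
    components = {name: float(value) for name, value in terms}
    return final_score, search_score, components

final_score, search_score, components = parse_energy_line(SAMPLE)
gap = final_score - search_score          # difference between the two raw scores
total = sum(components.values())          # sum of the listed energy terms
```

For the sample line, the ten listed components sum to the raw score of -1240.4, and the final-alignment score differs from the combinatorial-search result by 1.6.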

The notations in the sort file are: "C-ndx" for the normalized radius of gyration, "raw" for the total energy, "mut" for the mutation energy component, "sing" for the singleton energy component, and "pair" for the total pairwise energy component.

Confidence Assessment of Threading Results

The following considerations can be used to assess the confidence level of the threading results:

  • Z-score. The higher the z-score, the more reliable the prediction. Instead of using the z-score directly, a more informative confidence index is used. The confidence index is defined as the probability that a sequence-template pair with a given z-score is a related protein pair. The index values are estimated by running a large number of threadings and counting the number of true positives as a function of z-score.

    z-score interval   Confidence Index   Category    Similarity Level
    <6                 ~0                 unlikely    unrelated
    6-8                0.35               low         superfamily/fold
    8-10               0.63               medium      superfamily/fold
    10-12              0.85               high        family/superfamily
    12-20              0.96               very high   family/superfamily
    >20                1.00               certain     family



  • Normalized radius of gyration. This gives a good estimate of the compactness of the aligned portion in threading. If the value is above 3.0, the aligned portion is not compact enough, unless the template is a multi-domain protein.

  • Correlation between the secondary structure prediction and the secondary structures assigned from the PROSPECT alignment. Secondary structure predictions are known to have an accuracy of about 70%. If the correlation is too low for a sizable target protein (more than 100 amino acids), the prediction may not be reliable.
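The first two checks above can be sketched as a small Python lookup. The names `confidence_from_zscore` and `is_compact` are hypothetical helpers for this example, not PROSPECT functions. The page does not say how boundary values such as 8 or 20 are assigned, so the sketch assumes each interval includes its lower bound, and it returns 0.0 for the "~0" entry.

```python
# Rows of the confidence table on this page:
# (lower bound of z-score interval, confidence index, category, similarity level).
# Assumption: a z-score equal to a boundary falls into the higher interval.
CONFIDENCE_TABLE = [
    (20.0, 1.00, "certain", "family"),
    (12.0, 0.96, "very high", "family/superfamily"),
    (10.0, 0.85, "high", "family/superfamily"),
    (8.0, 0.63, "medium", "superfamily/fold"),
    (6.0, 0.35, "low", "superfamily/fold"),
]

def confidence_from_zscore(z):
    """Map a z-score to (confidence index, category, similarity level)."""
    for lower, index, category, similarity in CONFIDENCE_TABLE:
        if z >= lower:
            return index, category, similarity
    # Below 6 the table lists the index as "~0"; 0.0 is used as a stand-in.
    return 0.0, "unlikely", "unrelated"

def is_compact(c_ndx, multi_domain_template=False, threshold=3.0):
    """Apply the 3.0 cutoff on the normalized radius of gyration.

    Values above the threshold are flagged unless the template is
    known to be a multi-domain protein.
    """
    return multi_domain_template or c_ndx <= threshold
```

For example, `confidence_from_zscore(9.1)` falls in the 8-10 row of the table, and `is_compact(3.5)` is false unless `multi_domain_template=True` is passed.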

Generating Structures from Alignments

The output file provides detailed information for users to analyse the alignments. The core alignment shows the aligned positions of the cores, their secondary structure types ("h" for alpha-helix and "e" for beta-sheet), and their starting residue numbers in the PDB file. Aligned residues are marked by "|" if they are identical, by ":" if they are similar, and by "." if they are related. The residue range in the template gives the residue number, and the extra character associated with the residue number (if any), for the starting and ending positions of the aligned portion in the template. The template portion in this range can be viewed with a molecular graphics program.
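To get a quick summary of an alignment, the marker symbols described above can be tallied. The following is a minimal sketch: `summarize_markers` is a hypothetical helper, and the marker string in the usage line is made up for illustration rather than taken from real PROSPECT output, whose exact layout is not shown on this page.

```python
from collections import Counter

# Marker symbols as described on this page.
MARKER_MEANING = {"|": "identical", ":": "similar", ".": "related"}

def summarize_markers(marker_line):
    """Count identical, similar, and related positions in one marker line."""
    counts = Counter(marker_line)
    return {meaning: counts[symbol] for symbol, meaning in MARKER_MEANING.items()}

# Hypothetical marker line; spaces stand for positions with no marker.
summary = summarize_markers("||:. |:  |")
```

Here `summary` holds 4 identical, 2 similar, and 1 related position.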


Life Sciences Division - ORNL