Protein Informatics Group
  

PROSPECT Version 2.0:

   

Template Format and Creating New Templates

PROSPECT users can choose either the SCOP Domain Library or the FSSP Chain Library as the template database for threading. PROSPECT uses FSSP as the default template library because FSSP is updated frequently, following PDB releases. The FSSP library used in PROSPECT covers PDB structures released before May 2002. The SCOP domain library was constructed from the version 1.59 release (15 May 2002).

Each template is contained in a single XML file (this is one of the changes made between PROSPECT 1 and PROSPECT 2). We have tried to include every template that would be necessary for threading, but 'power users' may wish to create their own templates and thread against them.
New templates can be generated from PDB files with the 'make_template' program in the PROSPECT suite. This process requires PSI-BLAST, the NR database, Makemat, and the DSSP program.

Usage:

make_template -pdbfile <file> [-c 'Chain Letter']/[-r <start residue> <end residue>] [-n <template name>]

The following environment variables must be set:
  • BLASTPGP_EXE
  • BLASTPGP_DB
  • MAKEMAT_EXE
  • DSSP_EXE
Note that a 'minimal' PDB file (one with only ATOM entries) will not work with make_template. make_template runs PSI-BLAST on the sequence of the template, but because the ATOM entries frequently do not include all of the residues that are part of the chain, make_template extracts the sequence from the SEQRES entries.
It is also a good idea to have a properly formatted HEADER entry, so that make_template can read the template ID from the file (alternatively, the ID can be set with the '-n' flag).
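
The requirements above can be checked before make_template is run. The following Python sketch is not part of the PROSPECT suite; the function names (missing_env, pdb_record_types, preflight) are hypothetical. It only tests that the four environment variables are set and that the PDB file contains SEQRES and HEADER records, and it assumes the record name occupies the first six columns of each PDB line.

```python
import os

# Environment variables that make_template expects, as listed above.
REQUIRED_ENV = ("BLASTPGP_EXE", "BLASTPGP_DB", "MAKEMAT_EXE", "DSSP_EXE")


def missing_env(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV if not env.get(name)]


def pdb_record_types(pdb_text):
    """Return the set of record names (columns 1-6) found in PDB text."""
    return {line[:6].strip() for line in pdb_text.splitlines() if line.strip()}


def preflight(pdb_text, env=os.environ, have_name=False):
    """Collect readable problems; an empty list means nothing obvious is wrong.

    have_name should be True when a template name will be passed with -n,
    in which case a missing HEADER record is not reported.
    """
    problems = [f"environment variable {name} is not set"
                for name in missing_env(env)]
    records = pdb_record_types(pdb_text)
    if "SEQRES" not in records:
        problems.append("no SEQRES records: the template sequence cannot be extracted")
    if "HEADER" not in records and not have_name:
        problems.append("no HEADER record: supply a template name with -n")
    return problems
```

A typical use would be to read the PDB file into a string, call preflight on it, and print each returned problem before invoking make_template.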

Using Custom Templates:

There are two ways to use a custom template. First, you can pass it directly to the threading program with the -tempfile flag, for example:
threading.LINUX -phdfile myseq.ss -tempfile custom_template.xml
This method is not recommended, however, because later tools such as convertProspect will not be able to find the template. Instead, put the template in one of the directories listed in $PROSPECT_PATH/data/parameters/template_paths.
A good plan is to keep personal custom templates in $HOME/prospect_templates and to put templates that you want to share with other users of your system in $PROSPECT_PATH/data/templates_local.
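
As an illustration of the second method, the Python sketch below copies a template XML file into a personal template directory. The helper name install_template is hypothetical and not part of PROSPECT. The sketch does not edit template_paths, whose format is not described here, so it assumes the destination directory is already listed in that file.

```python
import shutil
from pathlib import Path


def install_template(template_xml, dest_dir=None):
    """Copy a template XML file into a template directory and return the new path.

    dest_dir defaults to $HOME/prospect_templates, the personal location
    suggested above. The directory is created if it does not exist.
    """
    dest = Path(dest_dir) if dest_dir else Path.home() / "prospect_templates"
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(template_xml).name
    shutil.copy2(template_xml, target)  # copy2 also preserves the timestamps
    return target


# Example call (commented out so that nothing is copied on import):
# install_template("custom_template.xml")
```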


Segment Formats:

The fssp portion of the XML file:

REM 1eso with 154 residues (58, 52) and 9 cores

REM F RS NUM SS ACC x-Cb y-Cb z-Cb x-Ca y-Ca z-Ca
RES 1 A 2 L 84 3.272 3.144 -9.294 2.420 1.910 -9.163
RES 4 S 3 E 69 3.293 -1.759 -6.554 4.056 -0.439 -6.662
RES 4 E 4 E 78 8.146 -0.153 -4.910 7.456 -1.438 -5.377
RES 4 K 5 E 162 9.042 -5.858 -4.437 8.815 -4.549 -3.671
RES 4 V 6 E 3 10.608 -3.092 0.509 11.042 -4.049 -0.610
RES 4 E 7 E 130 14.309 -7.618 -0.130 13.145 -6.962 0.609
RES 4 M 8 E 0 12.084 -7.323 5.161 13.400 -7.190 4.397
RES 4 N 9 E 43 17.660 -8.126 6.095 16.260 -8.570 6.535
RES 4 L 10 E 38 17.388 -10.690 10.731 17.001 -9.323 10.157
RES 4 V 10A E 9 18.814 -5.342 11.790 19.210 -6.815 11.942
RES 2 T 10B E 76 22.120 -8.432 15.178 21.395 -7.098 15.002
RES 1 S 10C T 58 24.149 -4.845 18.202 23.867 -4.830 16.703

Header: a general description of the template
REM 1eso with 154 residues (58, 52) and 9 cores
    |         |             |   |       |
    |         |             |   |       number of core secondary structures
    |         |             |   number of residues considered as core residues, i.e., with flag 4
    |         |             number of alpha-helix or beta-sheet residues, i.e., with flags 2-4
    |         total number of residues in the template
    template name, same as PDB code
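
For readers who want to process templates programmatically, here is a minimal Python sketch that pulls the five values out of a header line. It assumes the header always has exactly the wording of the example above; the regular expression and the names TemplateHeader and parse_header are illustrative, not part of PROSPECT.

```python
import re
from typing import NamedTuple


class TemplateHeader(NamedTuple):
    name: str             # template name, same as PDB code
    n_residues: int       # total number of residues in the template
    n_ss_residues: int    # alpha-helix or beta-sheet residues (flags 2-4)
    n_core_residues: int  # core residues (flag 4)
    n_cores: int          # number of core secondary structures


# Pattern modelled on the single example header shown above.
_HEADER_RE = re.compile(
    r"^REM\s+(\S+)\s+with\s+(\d+)\s+residues\s+"
    r"\((\d+),\s*(\d+)\)\s+and\s+(\d+)\s+cores$"
)


def parse_header(line):
    """Parse a template header line; raise ValueError if it does not match."""
    match = _HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a template header line: {line!r}")
    name, *counts = match.groups()
    return TemplateHeader(name, *(int(c) for c in counts))


header = parse_header("REM 1eso with 154 residues (58, 52) and 9 cores")
print(header.name, header.n_residues, header.n_cores)  # prints: 1eso 154 9
```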

REM: entry label (REM for remarks and RES for protein residues)

F: flag (1 for sequence only, 2 for an alpha-helix or beta-sheet residue, 3 for an alpha-helix or beta-sheet residue without C-beta coordinates, and 4 for a core residue)

RS: one-letter code of amino acid type (X for residues other than the standard 20 amino acid types).

NUM: residue number in PDB (including possible extra character associated with it).

SS: secondary structure type, using the same convention as in DSSP (e.g., H for alpha-helix and E for beta-sheet).

ACC: Solvent accessible surface area calculated by DSSP.

x-Cb, y-Cb, z-Cb: C-beta coordinates.

x-Ca, y-Ca, z-Ca: C-alpha coordinates.
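
The field descriptions above map naturally onto a small record type. The following Python sketch parses RES lines under the assumption that the fields are separated by whitespace and that all twelve tokens are present, as in the sample shown earlier. How residues with flag 3 (no C-beta coordinates) are written is not specified here, so lines with a different number of fields are rejected rather than guessed at. The names Residue, parse_res_line, parse_residues, and count_by_flag are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    flag: int    # F: 1 to 4, as described above
    aa: str      # RS: one-letter amino acid code
    num: str     # NUM: kept as text because it may carry a letter, e.g. 10A
    ss: str      # SS: secondary structure type (DSSP convention)
    acc: float   # ACC: solvent accessible surface area
    cb: tuple    # (x, y, z) of C-beta
    ca: tuple    # (x, y, z) of C-alpha


def parse_res_line(line):
    """Parse one RES line into a Residue; raise ValueError on unexpected layout."""
    fields = line.split()
    if len(fields) != 12 or fields[0] != "RES":
        raise ValueError(f"unexpected RES line: {line!r}")
    coords = tuple(float(value) for value in fields[6:12])
    return Residue(int(fields[1]), fields[2], fields[3], fields[4],
                   float(fields[5]), coords[:3], coords[3:])


def parse_residues(text):
    """Parse every RES line in a block of text, skipping REM and blank lines."""
    return [parse_res_line(line) for line in text.splitlines()
            if line.startswith("RES")]


def count_by_flag(residues):
    """Count residues per flag value, e.g. to compare against the header totals."""
    return Counter(residue.flag for residue in residues)


sample = """REM F RS NUM SS ACC x-Cb y-Cb z-Cb x-Ca y-Ca z-Ca
RES 1 A 2 L 84 3.272 3.144 -9.294 2.420 1.910 -9.163
RES 4 V 10A E 9 18.814 -5.342 11.790 19.210 -6.815 11.942
"""
residues = parse_residues(sample)
print(residues[1].num, residues[1].ca)
```

On a complete template, the number of residues with flag 4 returned by count_by_flag should match the core residue count in the header, and the sum over flags 2 to 4 should match the count of alpha-helix or beta-sheet residues.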
