The Epitope Location Finder (ELF) site performs various analyses based on a submitted HIV peptide sequence and HLA types of interest. It is meant to be a workbench for experimentalists who have CTL reactivity data for peptides and who would like to quickly define probable epitopes within a peptide based on HLA anchor motifs, or to find information about previously described epitopes stored in the immunology database. It can also help identify CTL reactivities that may have been missed because of variation in the sequence strain chosen as the basis for the peptides used to test the response. To get a feel for how the site works, run the "Sample Input".
The Paste protein sequence data entry box is the place to enter a sequence of interest for analysis, such as an immunologically active peptide. Sequences may be submitted in upper or lower case single-letter amino acid codes. Spaces and gaps will be removed by the software.
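The clean-up described above can be pictured with a minimal Python sketch. It is an illustration only, not the site's code: the function name `normalize_sequence` and the choice of `-` and `.` as gap characters are assumptions.

```python
def normalize_sequence(raw: str) -> str:
    """Upper-case a pasted sequence and drop whitespace and gap characters.

    Assumption: gaps are written as '-' or '.'; the site may accept others.
    """
    gap_chars = {"-", "."}
    return "".join(
        ch.upper() for ch in raw if not ch.isspace() and ch not in gap_chars
    )


# Mixed case, spaces, and one gap character all reduce to the same sequence.
print(normalize_sequence("pgggqivggv YLLPRRGPRL gvr-atrktse rsqprg"))
# PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG
```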
Choose HLA. This option consists of a pair of selection menus, one for selecting genotypes and another for serotypes. Any number of items can be chosen from each list; use a shift-click or control-click combination for multiple selections.
Show all known epitopes. When "Show all epitopes" is checked, ELF will find all known epitopes in our database within the bounds of your submitted protein regardless of their HLA. Any epitopes found which agree with your submitted HLAs will be flagged. If you uncheck this box, the program will find only known database epitopes whose serotype/genotype agrees with your submitted HLAs.
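The effect of the checkbox can be sketched as a small filter. This is a simplified illustration under stated assumptions: the function name, the record layout (sequence, HLA), and the third record are hypothetical, and the sketch compares HLA names literally, whereas the site also considers the genotypes associated with each submitted HLA.

```python
def select_known_epitopes(records, submitted_hlas, show_all=True):
    """Return (sequence, hla, agrees) tuples for the epitopes to display.

    records        : iterable of (sequence, hla) pairs (hypothetical layout)
    submitted_hlas : set of HLA names chosen by the user
    show_all       : True keeps every epitope and only flags agreement;
                     False keeps only epitopes whose HLA agrees.
    """
    selected = []
    for sequence, hla in records:
        agrees = hla in submitted_hlas
        if show_all or agrees:
            selected.append((sequence, hla, agrees))
    return selected


records = [
    ("YLLPRRGPRL", "A2"),
    ("GPRLGVRAT", "B7"),
    ("AAAAAAAAA", "B35"),  # placeholder record, not taken from the database
]

# Box checked: all three records are returned, the last one unflagged.
print(select_known_epitopes(records, {"A2", "B7"}, show_all=True))
# Box unchecked: only the A2 and B7 records are returned.
print(select_known_epitopes(records, {"A2", "B7"}, show_all=False))
```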
The first part of the output, reproduced here, presents 5 links. The output of each of these links is shown below.
View Genomic location of your peptide.
View HLAs associated with your submitted HLA.
View All anchor motifs used in this analysis.
View Potential "epitopes" ordered by HLAs.
View Database records for known CTL epitopes in this region, regardless of HLA.
The first link, Genomic location of your peptide, connects to a page with summary information about the location of the submitted protein as mapped onto the reference sequence genome. The table gives the location information, and the alignment shows your sequence aligned to the HIV reference sequence.
Table of protein regions touched by query sequence. AA = amino acid, NA = nucleic acid.
CDS | AA position relative to query sequence start | AA position relative to polyprotein start in reference sequence | AA position relative to protein start in reference sequence | NA position relative to CDS start in reference sequence | NA position relative to reference sequence genome start |
Name of Protein | 1 -> 36 | 25 -> 60 | 25 -> 60 | 73 -> 180 | 414 -> 521 |
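The amino acid and nucleic acid columns in the sample row are related by ordinary codon arithmetic (three nucleotides per residue, 1-based inclusive ranges). The sketch below reproduces the row's numbers. The function name is hypothetical, and the CDS start of 342 is an assumption inferred from the sample row (414 - 73 + 1), not a value stated by the site.

```python
def aa_range_to_na_ranges(aa_start, aa_end, cds_genome_start):
    """Convert a 1-based inclusive AA range within a CDS to NA ranges.

    Returns ((na_start, na_end) relative to the CDS start,
             (na_start, na_end) relative to the genome start).
    """
    na_start = (aa_start - 1) * 3 + 1   # first base of the first codon
    na_end = aa_end * 3                 # last base of the last codon
    genome_start = cds_genome_start + na_start - 1
    genome_end = cds_genome_start + na_end - 1
    return (na_start, na_end), (genome_start, genome_end)


# AA 25 -> 60 of the protein, with the CDS assumed to start at genome base 342.
cds_range, genome_range = aa_range_to_na_ranges(25, 60, cds_genome_start=342)
print(cds_range)     # (73, 180)
print(genome_range)  # (414, 521)
```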
Alignment of the query sequence to the HIV reference sequence (Similarity % = 100.0)
Query         PGGGQIVGGV YLLPRRGPRL GVRATRKTSE RSQPRG  36
              :::::::::: :::::::::: :::::::::: ::::::
ReferenceSeq  PGGGQIVGGV YLLPRRGPRL GVRATRKTSE RSQPRG
The second link, HLAs associated with your submitted HLA, displays a table whose left column lists the HLAs you submitted. In the right column are their corresponding genotypes as defined in "The HLA dictionary 1999: a summary of HLA-A, -B, -C, -DRB1/3/4/5, -DQB1 alleles and their association with serologically defined HLA-A, -B, -C, -DR and -DQ antigens," G. M. Th. Schreuder, C. K. Hurley, S. G. E. Marsh, M. Lau, M. Maiers, C. Kollman, H. Noreen. Tissue Antigens 54: 409-437 (1999).
Submitted HLAs | Associated HLAs |
A2, B7 | A2, A2.1, A*0201, A*0202, A*0203, A*0204, A*0205, A*0206, A*0207, A*0208, A*0209, A*0210, A*0211, A*0212, A*0213, A*0214, A*0216, A*0217, A*0218, A*0220, A*0221, A*0222, A*0224, A*0225, A*0229, B7, B*07, B*0702, B*0703, B*0704, B*0705, B*0706, B*0707, B*0709, B*0711, |
The third link, the Anchor Motifs table, shows any known anchor residue motifs associated with the HLAs submitted. In the motifs column, a "." character means that any amino acid may occur at that position, while square brackets list the amino acids allowed at an anchor position, one of which must be present. For example, in the A*0201 motifs below, an L or an M must occur at position 2. The HLA anchor residue motifs used here are listed in The HLA Fact Book by S. Marsh, P. Parham, and L. Barber, published by Academic Press, and in MHC Ligands and Peptide Motifs by H. G. Rammensee, J. Bachmann, and S. Stevanovic, published by Chapman & Hall, 1997. For more detailed information about HLA motifs, see the SYFPEITHI Web site.
HLA | Anchor Residue Motifs |
A*0201 | .[LM]......[VL] |
A*0201 | .[LM].....[VL] |
A*0201 | .[LM].......[VL] |
A*0202 | .[L]......[LV] |
etc. | etc. |
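The motif notation happens to be valid regular expression syntax, so a motif can be read mechanically. The following Python sketch, with hypothetical helper names, extracts the peptide length and anchor positions from a motif and checks whether a whole peptide fits it. It illustrates the notation only and makes no claim about how ELF itself evaluates motifs.

```python
import re

# One token per peptide position: either "." or a bracketed residue list.
_TOKEN = re.compile(r"\[[A-Z]+\]|\.")


def motif_length(motif: str) -> int:
    """Number of peptide positions the motif describes."""
    return len(_TOKEN.findall(motif))


def anchor_positions(motif: str) -> dict:
    """Map each 1-based anchor position to its allowed residues."""
    tokens = _TOKEN.findall(motif)
    return {i: tok[1:-1] for i, tok in enumerate(tokens, start=1) if tok != "."}


def peptide_fits_motif(peptide: str, motif: str) -> bool:
    """True if the whole peptide matches the motif (same length, anchors satisfied)."""
    return re.fullmatch(motif, peptide.upper()) is not None


print(motif_length(".[LM]......[VL]"))                    # 9
print(anchor_positions(".[LM]......[VL]"))                # {2: 'LM', 9: 'VL'}
print(peptide_fits_motif("LLPRRGPRL", ".[LM]......[VL]"))  # True
print(peptide_fits_motif("YLLPRRGPRL", ".[LM]......[VL]")) # False (10 residues, 9-position motif)
```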
The fourth link, the Potential Epitopes table, shows that at position 12-20 of the submitted sequence "PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG" there occurs an amino acid string which matches one of the A*0201 anchor residue motifs in the Anchor Motifs Table above.
pgggqivggvyLLPRRGPRLgvratrktsersqprg   submitted peptide
           |||||||||                   matches
           .L......L                   an A*0201 anchor motif
Note, these potential "epitopes" are purely theoretical; they may well have never been observed, and do not occur in our database. In the table they are ordered by HLA.
Position in query peptide | AA sequence | HLA | Anchor motif |
(12-20) | LLPRRGPRL | A*0201 | .[LM]......[VL] |
(11-20) | YLLPRRGPRL | A*0201 | .[LM].......[VL] |
(12-20) | LLPRRGPRL | A*0202 | .[L]......[LV] |
(11-20) | YLLPRRGPRL | A*0202 | .[L].......[LV] |
(12-20) | LLPRRGPRL | A*0204 | .[L]......[L] |
etc. | etc. | etc. | etc. |
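Rows like those in the table above can be reproduced by sliding each motif along the query sequence. The sketch below is one way to do that in Python, assuming the motifs are used directly as regular expressions; the function name and the tuple layout are hypothetical, and ELF's own implementation may differ. A lookahead is used so that overlapping matches are all reported.

```python
import re


def find_potential_epitopes(query, motifs):
    """Scan the query for every window that matches an anchor motif.

    motifs : list of (hla, motif) pairs in the notation of the Anchor Motifs table
    Returns (start, end, sequence, hla, motif) tuples with 1-based inclusive positions.
    """
    query = query.upper()
    hits = []
    for hla, motif in motifs:
        # Zero-width lookahead: try the motif at every start position.
        for match in re.finditer(f"(?=({motif}))", query):
            start = match.start() + 1
            sequence = match.group(1)
            hits.append((start, start + len(sequence) - 1, sequence, hla, motif))
    return hits


query = "PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG"
a0201_motifs = [
    ("A*0201", ".[LM]......[VL]"),
    ("A*0201", ".[LM].....[VL]"),
    ("A*0201", ".[LM].......[VL]"),
]
for start, end, sequence, hla, motif in find_potential_epitopes(query, a0201_motifs):
    print(f"({start}-{end}) | {sequence} | {hla} | {motif} |")
# (12-20) | LLPRRGPRL | A*0201 | .[LM]......[VL] |
# (11-20) | YLLPRRGPRL | A*0201 | .[LM].......[VL] |
```

The 8-position A*0201 motif produces no row because no 8-residue window of this query has L or M at position 2 together with V or L at position 8.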
The last link, Database Records, connects to the CTL search tool. Database records for known CTL epitopes in the region of the query sequence are displayed, regardless of HLA.
This part of the ELF output displays epitopes from the HIV immunology database whose HLA agrees with your submitted HLA and whose range is within the bounds of your submitted protein sequence. These known epitopes are aligned to your query sequence. Each epitope is a clickable link that displays the entire database record for that epitope. The HLA is listed after each epitope, and the button next to it, when clicked, uses Epilign to align that epitope to all sequences in our most recent "Complete genome alignment".
Bold letters colored red indicate residues in known epitopes which differ from the equivalent residues in the query sequence. These residues may have bearing on the reactivity of the homologous peptide in your query.
PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG
          YLLPRRGPRL A2
          YLLPRRGPRL A2
          YLLPTTGPRL A2
          YLLPRRGPRL A2
          YLLPRRGPRL A2
          YLLPRRGPRL A2
          YLLPRRGPRL A2
          YLLPRRGPRL A2
          YLLPSRGPKL A2
          YLLPRRGPRL A2.1
           LLPRRGPRL A2
           LLPRRGPRL A2
                GPRLGVRAT B7
                GPRLGVRAT B7
                GPRLGVRAT B7
                GPRLGVRAT B7
                GPRLGVRAT B7
                GPRLGVRAT B7
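The highlighted residues can be found by comparing each known epitope with the query window it is aligned to. The sketch below assumes the epitope's 1-based start position in the query is already known (for the sample, the Y of YLLPRRGPRL is at position 11); the function name is hypothetical.

```python
def differing_positions(query, epitope, start):
    """Return 1-based query positions where the aligned epitope differs.

    start : 1-based position in the query where the epitope is aligned
            (assumed to be known, for example from the database record).
    """
    window = query[start - 1:start - 1 + len(epitope)]
    return [
        start + offset
        for offset, (q_res, e_res) in enumerate(zip(window, epitope))
        if q_res != e_res
    ]


query = "PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG"
print(differing_positions(query, "YLLPRRGPRL", 11))  # []
print(differing_positions(query, "YLLPTTGPRL", 11))  # [15, 16]
print(differing_positions(query, "YLLPSRGPKL", 11))  # [15, 19]
```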
These peptides have anchor residues that match one or more motifs associated with the submitted HLA, but are not found in our database. The alignment below presents them in genomic order. They are also available in HLA order in the clickable links earlier in this output.
PGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRG
   GQIVGGVYL (A*0205 .[VLIMQ]......[L])
   GQIVGGVYL (A*0214 .[VQL]......[LV])
   GQIVGGVYLL (A*0205 .[VLIMQ].......[L])
   GQIVGGVYLL (A*0214 .[VQL].......[LV])
    QIVGGVYL (A*0205 .[VLIMQ].....[L])
    QIVGGVYLL (A*0205 .[VLIMQ]......[L])
     IVGGVYLL (A*0205 .[VLIMQ].....[L])
     IVGGVYLL (A*0214 .[VQL].....[LV])
            LPRRGPRL (B7 .[P].....[LF])
            LPRRGPRL (B*0702 .[P].....[L])
            LPRRGPRL (B*0703 .[P].....[L])
            LPRRGPRL (B*0705 .[P].....[L])