HIV molecular immunology database
The Hepitope tool searches for hopeful-epitopes or "Hepitopes". The tool tests for HLA alleles that are enriched in a set of individuals that react with a set of peptides. This can be used in conjunction with ELF (Epitope Location Finder) to scan the peptides for known epitopes in the database and anchor motifs to help identify epitopes within a larger peptide fragment. See below for details about the input and output of the program.
Input
To look for HLA enrichment in reactive individuals, we build a 2-by-2 contingency table for each HLA allele. We present four columns of data:
a: the number of individuals who carry the HLA allele and react with the peptide
b: the number of individuals who do not carry the HLA allele and react with the peptide
c: the number of individuals who carry the HLA allele and do not react with the peptide
d: the number of individuals who do not carry the HLA allele and do not react with the peptide
We then calculate a one-sided Fisher's exact test to determine whether the HLA allele is enriched among the reactive individuals beyond what would be expected by chance alone; the resulting p-value is reported in a fifth column. This test is conducted for every unique HLA found in the set of people that react with the peptide. The ELF program can then be used to rapidly search the database for known epitopes embedded in the peptide, and for anchor motifs for the epitope of interest. Links to the alignment of known epitopes can be quickly obtained, as well as the database entries regarding the epitopes.
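As a rough illustration of the test, the sketch below computes a one-sided (enrichment) Fisher's exact p-value from the hypergeometric distribution. It assumes that a and b count reactive individuals with and without the allele, and that c and d count non-reactive individuals with and without it, which is what the sample output on this page suggests. The function name `fisher_one_sided` is hypothetical and is not part of the Hepitope tool.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for enrichment.

    Assumed layout (not confirmed by the tool's source code):
      a = reactive, carries allele      b = reactive, no allele
      c = non-reactive, carries allele  d = non-reactive, no allele
    Returns P(X >= a) with all table margins held fixed.
    """
    reactive = a + b
    carriers = a + c
    total = a + b + c + d
    denominator = comb(total, reactive)
    # Sum the hypergeometric probabilities of tables at least as extreme as observed.
    upper = min(reactive, carriers)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, reactive - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Counts taken from the sample output for peptide Gag1.
    for allele, counts in [("B*1701", (1, 0, 0, 3)), ("A*0201", (1, 0, 1, 2))]:
        print(allele, f"{fisher_one_sided(*counts):.8f}")
```

With the four-patient example on this page, the p-values cannot go below 0.25, which shows how little power the test has on very small cohorts.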
The input requires two files:
The first contains every patient and their HLA alleles in the following format:
List of patient HLA alleles:
Patient1 A*0201 A*0201 B*5703 B*1701 Cw*0701 Cw*0705
Patient2 A*0201 A*0701 B*1202 B*0801 Cw*0701 Cw*0401
Patient3 A*1101 A*2403 B*0801 B*5801 Cw*0701 Cw*1501
Patient4 A*3002 A*3002 B*5802 B*5802 Cw*0602 Cw*0602
An allele can be written as a serotype (A2) or a genotype (A*0201), but if both forms are used they are treated as separate alleles in the analysis.
If an HLA type is unknown, it should be written as a single character. For example, if the C alleles had not yet been determined in Patient1, the HLA could be written as:
Patient1 A*0201 A*0201 B*5703 B*1701 C C
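The file format above could be read with a few lines of Python. This is a minimal sketch, not the tool's own parser: it assumes one patient per line with whitespace-separated fields, and it assumes that single-character placeholders such as `C` are simply ignored. The name `parse_hla_lines` is hypothetical.

```python
def parse_hla_lines(lines):
    """Map each patient ID to the set of HLA alleles listed for that patient.

    Assumptions (not confirmed by the tool): one patient per line,
    whitespace-separated fields, and single-character entries mark an
    undetermined allele and are skipped.
    """
    patients = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        patient_id, alleles = fields[0], fields[1:]
        # A set collapses homozygous entries (e.g. A*0201 A*0201) to one allele.
        patients[patient_id] = {allele for allele in alleles if len(allele) > 1}
    return patients


example = [
    "Patient1 A*0201 A*0201 B*5703 B*1701 C C",
    "Patient4 A*3002 A*3002 B*5802 B*5802 Cw*0602 Cw*0602",
]
hla = parse_hla_lines(example)
print(sorted(hla["Patient1"]))
```

Storing alleles as a set means a homozygous patient counts once as a carrier, which matches the counts shown in the sample output.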
A second file is needed that is a list of the reactive peptides followed by the patient ID of those that had a T-cell response to the peptide. Patients that aren't listed are assumed to be known not to respond:
List of reactive peptides:
Gag1 MGARASVLSGGELDRWEK Patient1
Gag2 SGGELDRWEKIRLRPGGK Patient2 Patient3
Gag3 EKIRLRPGGKKKYKLKHI Patient4
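A reader for the peptide file might look like the sketch below. It assumes one peptide per line, with the peptide name, its sequence, and then zero or more responder IDs separated by whitespace. The name `parse_peptide_lines` is hypothetical and not part of the tool.

```python
def parse_peptide_lines(lines):
    """Map each peptide name to (sequence, set of responding patient IDs).

    Assumption (not confirmed by the tool): one peptide per line, with
    whitespace-separated fields in the order name, sequence, responders.
    """
    peptides = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        name, sequence, responders = fields[0], fields[1], fields[2:]
        peptides[name] = (sequence, set(responders))
    return peptides


example = [
    "Gag1 MGARASVLSGGELDRWEK Patient1",
    "Gag2 SGGELDRWEKIRLRPGGK Patient2 Patient3",
    "Gag3 EKIRLRPGGKKKYKLKHI Patient4",
]
peptides = parse_peptide_lines(example)
print(peptides["Gag2"])
```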
Output
The output provides a recapitulation of the input files and the results for each peptide, including a list of the HLAs of each patient.
| Peptide | Sequence | HLA Type | a | b | c | d | P |
|---|---|---|---|---|---|---|---|
| Gag1 | MGARASVLSGGELDRWEK | | | | | | |
| | | B*1701 | 1 | 0 | 0 | 3 | 0.25000000 |
| | | B*5703 | 1 | 0 | 0 | 3 | 0.25000000 |
| | | Cw*0705 | 1 | 0 | 0 | 3 | 0.25000000 |
| | | A*0201 | 1 | 0 | 1 | 2 | 0.50000000 |
| | | Cw*0701 | 1 | 0 | 2 | 1 | 0.75000000 |
The columns a, b, c, and d are the four elements of the contingency table described above, and P is the one-sided Fisher's exact p-value.
The same set of results is reported for each peptide entered.
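The counts in the Gag1 rows can be reproduced from the example input with the sketch below. It assumes the a, b, c, d layout implied by the sample output (reactive carriers, reactive non-carriers, non-reactive carriers, non-reactive non-carriers) and tests only alleles found in at least one responder. The function name `contingency_rows` is hypothetical.

```python
# Example cohort from this page, with each patient's alleles stored as a set.
hla = {
    "Patient1": {"A*0201", "B*5703", "B*1701", "Cw*0701", "Cw*0705"},
    "Patient2": {"A*0201", "A*0701", "B*1202", "B*0801", "Cw*0701", "Cw*0401"},
    "Patient3": {"A*1101", "A*2403", "B*0801", "B*5801", "Cw*0701", "Cw*1501"},
    "Patient4": {"A*3002", "B*5802", "Cw*0602"},
}
gag1_responders = {"Patient1"}


def contingency_rows(hla, responders):
    """Return {allele: (a, b, c, d)} for every allele carried by a responder.

    Assumed layout: a = reactive carriers, b = reactive non-carriers,
    c = non-reactive carriers, d = non-reactive non-carriers.
    """
    tested = set().union(*(hla[p] for p in responders if p in hla))
    rows = {}
    for allele in sorted(tested):
        a = sum(1 for p, alleles in hla.items() if p in responders and allele in alleles)
        b = sum(1 for p, alleles in hla.items() if p in responders and allele not in alleles)
        c = sum(1 for p, alleles in hla.items() if p not in responders and allele in alleles)
        d = sum(1 for p, alleles in hla.items() if p not in responders and allele not in alleles)
        rows[allele] = (a, b, c, d)
    return rows


rows = contingency_rows(hla, gag1_responders)
for allele, counts in rows.items():
    print(allele, *counts)
```

Note that in this sketch a patient whose allele is undetermined would be counted as a non-carrier; whether the tool itself handles unknown types that way is not stated on this page.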