HIV molecular immunology database
The HIV Molecular Immunology Database contains 3 search pages: CTL/CD8+ T-cell epitopes, T-helper/CD4+ T-cell epitopes, and antibodies.
Information below explains what each database contains and the meaning of terms used in the search interfaces.
T cell epitopes are categorized by the type of responding cell: cytotoxic T lymphocytes (CTL/CD8+) or helper T lymphocytes (T-helper/CD4+). The database organization for CTL/CD8+ and T-helper/CD4+ is identical, so they are described together.
The T-cell databases include tables, maps, and associated references of HIV-specific T-cell epitopes arranged sequentially according to the location of the proteins in the HIV-1 genome. We attempted to make this section as comprehensive as possible, requiring that the epitope be contained within a defined region of a maximum of 30 amino acids, but not that the optimal boundaries be defined. Studies that were based on the analysis of whole proteins are described at the end of each protein section. The same epitope can have multiple entries, and each entry represents a single publication in this section of the database. T-cell protein reactions with poorly defined epitopes are listed at the end of each protein section.
For a concise listing of the best-defined CTL epitopes, see the Best-defined Epitope Summary Table, or read the tables in the review articles by Nicole Frahm, Christian Brander, and colleagues.
Recent studies utilize multiple functions attributed to T cells to define responses, and the simple distinctions of cytotoxic T cells and helper T cells have become blurred as more is learned about the range of responses triggered in CD4 and CD8 positive T cells responding to antigenic stimulus. When adding the most recent studies, we have tried to place T-cell responses in a reasonable manner into our traditional CTL and helper T-cell sections, and to specify the assay used to measure the response in each study.
T cell epitopes are arranged by protein position. The table entries are sorted in a nested way: first by protein, then by HXB2 start location within the protein, then by HXB2 end location, and finally by MHC/HLA presenting molecule. Epitopes for which the HXB2 location is unknown appear at the end of the listing of the protein in which they are located.
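The nested sort described above can be sketched in a few lines of Python. This is only an illustration, not the database's own code: the `Entry` type, the `PROTEIN_ORDER` list, and the sample values are all assumptions made for the example.

```python
from typing import NamedTuple, Optional

# Hypothetical protein order, used only for this illustration.
PROTEIN_ORDER = ["p17", "p24", "Pol", "Env", "Nef"]

class Entry(NamedTuple):
    protein: str
    hxb2_start: Optional[int]  # None when the HXB2 location is unknown
    hxb2_end: Optional[int]
    mhc: str

def sort_key(e: Entry):
    unknown = e.hxb2_start is None or e.hxb2_end is None
    return (
        PROTEIN_ORDER.index(e.protein),  # 1. protein
        unknown,                         # unknown locations go last within a protein
        e.hxb2_start or 0,               # 2. HXB2 start
        e.hxb2_end or 0,                 # 3. HXB2 end
        e.mhc,                           # 4. MHC/HLA presenting molecule
    )

# Illustrative values only, not actual database records.
entries = [
    Entry("p24", 30, 40, "B*5701"),
    Entry("p17", None, None, "B8"),
    Entry("p17", 77, 85, "A2"),
    Entry("p17", 77, 85, "A*0201"),
    Entry("p17", 18, 26, "A3"),
]
sorted_entries = sorted(entries, key=sort_key)
for e in sorted_entries:
    print(e)
```

The boolean `unknown` flag is placed directly after the protein so that entries without an HXB2 location fall to the end of their own protein's listing rather than the end of the whole table.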
Each T cell epitope has a multi-part basic entry:
Record number: A unique number assigned by the database, in approximate order of entry. Please refer to this number if you have any comments or questions about an entry.
HXB2 Location: The viral strain HXB2 (GenBank Accession Number K03455) is used as a reference strain throughout this publication. The position of the defined epitope relative to the sequence of the HXB2 protein is indicated. The numbering in this table corresponds to the protein maps. Because of HIV-1 variation, the epitope may not actually be present in HXB2; rather, the position in HXB2 indicates the position aligned to the epitope. HXB2 was selected as the reference strain because so many studies use HXB2, and because crystal structures for HXB2-related proteins are available. The precise positions of an epitope on the HXB2 reference strain can be readily obtained using the Sequence Locator Tool.
Author Location: The amino acid positions of the epitope boundaries and the reference sequence are listed as given in the primary publication. Frequently, these positions as published are imprecise and do not truly correspond to the numbering of the sequence, but they provide a reasonable guide to the peptide's approximate location in the protein. Also, in many cases the reference sequence identification was not provided, and in such cases it is not possible to use these numbers to specify precise locations.
Subtype: The subtype under study, generally not specified for B subtype.
Epitope Sequence: The amino acid sequence of the epitope of interest as defined in the reference, based on the reference strain used in the study defining the epitope. On occasions when only the position numbers, and not the actual peptide sequence, were specified in the original publication, we tried to fill in the peptide sequence based on the position numbers and reference strain. If the sequences were numbered inaccurately by the primary authors, or if we made a mistake in this process, we may have misrepresented the epitope's amino acid sequence. Because of this uncertainty, epitope sequences that were not explicitly written in the primary publication, and that we determined by looking up the reference strain and the numbered location, are followed by a question mark in the table.
Epitope Name: If the epitope has a name attributed by the publication, it is recorded here, e.g. "SL9".
Species (MHC/HLA): The species responding and the MHC or HLA specificity of the epitope.
Immunogen: The original stimulus of the T-cell response. Often this is an HIV-1 infection. If a vaccine, rather than a natural infection, was the original antigenic stimulus, this is noted on a separate line, and additional information about the vaccine antigen is provided as available.
Keywords: Keywords are a searchable field for the web interface that is included in the T-cell sections of the printed version to help identify entries of particular interest.
Reference: The primary reference (sometimes two or more directly related studies are included). Some of the earlier references include notes.
Notes: Brief comments explain the context in which the epitope was studied and what was learned about the epitope in a given study.
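As a small illustration of the question-mark convention in the Epitope Sequence field, the sketch below separates the sequence from the "inferred" flag. The function name and the assumption that the mark is a trailing `?` on the sequence string are ours, made for the example; the actual export format may differ.

```python
def parse_epitope_field(field: str) -> tuple[str, bool]:
    """Split an Epitope Sequence value into (sequence, inferred).

    `inferred` is True when the value ends with a question mark, meaning
    the sequence was looked up from the position numbers and reference
    strain rather than written out in the primary publication.
    """
    field = field.strip()
    inferred = field.endswith("?")
    return field.rstrip("?"), inferred

print(parse_epitope_field("SLYNTVATL"))
print(parse_epitope_field("SLYNTVATL?"))
```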
All HIV T cell epitopes mapped to within a region of 21 amino acids or fewer are indicated on the HIV protein epitope maps. The location and MHC/HLA restriction elements of CTL epitopes are indicated on protein sequences of HXB2. These maps are meant to provide the relative location of defined epitopes on a given protein, but the HXB2 sequence may not actually carry the epitope of interest, as it may vary relative to the sequence for which the epitope was defined. Epitopes with identical boundaries and MHC/HLA fields are included in the maps only once. If one laboratory determines MHC/HLA presenting molecules at the serotype level (example: A2) and another at the genotype level (example: A*0201), both will be included in the map. MHC specificities are indicative of the host species; when no MHC presenting molecule is defined, the host species is noted.
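The map inclusion rules can be expressed as a short filter. This is a minimal sketch under stated assumptions: the `Epitope` type and `map_entries` function are hypothetical, region length is taken from the HXB2 start and end positions, and the sample values are illustrative rather than real records.

```python
from typing import NamedTuple

class Epitope(NamedTuple):
    protein: str
    start: int  # HXB2 start, inclusive
    end: int    # HXB2 end, inclusive
    mhc: str    # MHC/HLA field exactly as recorded

MAX_MAP_LENGTH = 21  # epitopes longer than this are left off the maps

def map_entries(epitopes):
    """Keep epitopes of 21 amino acids or fewer, one per (boundaries, MHC/HLA)."""
    seen = set()
    kept = []
    for ep in epitopes:
        if ep.end - ep.start + 1 > MAX_MAP_LENGTH:
            continue
        key = (ep.protein, ep.start, ep.end, ep.mhc)
        if key in seen:
            continue  # identical boundaries and MHC/HLA: shown only once
        seen.add(key)
        kept.append(ep)
    return kept

epitopes = [
    Epitope("p17", 77, 85, "A*0201"),
    Epitope("p17", 77, 85, "A*0201"),  # same epitope from a second publication
    Epitope("p17", 77, 85, "A2"),      # serotype-level call: a distinct map entry
    Epitope("p24", 1, 30, "B8"),       # 30 amino acids: too long for the map
]
on_map = map_entries(epitopes)
print(on_map)
```

Because the MHC/HLA field is compared as a plain string, A2 and A*0201 are never merged, which matches the rule that both serotype-level and genotype-level designations appear on the map.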
Epitope alignments can be generated using the epitope search tools. All epitopes are aligned to the HXB2 sequence, with the sequence used to define the epitope indicated directly above it. Sequences are sorted by their subtype and country of origin.
The master alignment files from which the epitope alignments were created are available at our sequence web site (HIV web alignments). The alignments were modified in some cases to optimize the alignment relative to the defined epitope and minimize insertions and deletions; epitope alignments are generated by anchoring on the C-terminal residue. A dash indicates identity to the consensus sequence, and a period indicates an insertion made to maintain the alignment. Stop codons are indicated with a $, and frameshifts by a #, or ambiguous codons (nucleotide was r, y, or n) by an x; they are inserted to maintain the alignments. In consensus sequences an upper case letter indicates the amino acid was present in all sequences, a lower case letter indicates the amino acid was present in most sequences in a given position, and a question mark indicates two or more amino acids were represented with equal frequency.
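To make the dash notation concrete, here is a minimal sketch of how one aligned row might be rendered against a consensus. The function `render_row` is hypothetical, and the sketch assumes the sequences are already aligned to equal length; the variant sequence is illustrative.

```python
SPECIAL = set(".$#x")  # insertion, stop codon, frameshift, ambiguous codon

def render_row(consensus: str, aligned: str) -> str:
    """Show '-' where the aligned row matches the consensus, else the residue."""
    if len(consensus) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for cons, res in zip(consensus, aligned):
        if res in SPECIAL:
            out.append(res)   # special symbols are always shown as-is
        elif res == cons:
            out.append("-")   # identity to the consensus
        else:
            out.append(res)   # substitution: show the differing residue
    return "".join(out)

print(render_row("SLYNTVATL", "SLFNTVATL"))
print(render_row("SLYNTVATL", "SLYNTV$TL"))
```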
The antibody database summarizes HIV-specific antibodies (Abs) arranged sequentially according to the location of their binding domain, organized by protein. We attempted to make this section as comprehensive as possible. For the monoclonal antibodies (MAbs) capable of binding to linear peptides, we require that the binding site be contained within a region of 30 or so amino acids to define the epitope, but not that the precise boundaries be defined. MAbs that do not bind to defined linear peptides are grouped by category at the end of each protein. Antibody categories, for example CD4 binding site (CD4BS) antibodies, are also noted in the index at the beginning of this section. Studies of polyclonal Ab responses are also included. Responses that are characterized only by binding to a protein, with no known specific binding site, are listed at the end of each protein.
The table entries are sorted in a nested way: first by protein, then by HXB2 start location within the protein, then by HXB2 end location, then by antibody type, and finally by antibody name. Abs that bind to conformational epitopes or with unknown epitopes are listed at the end of each protein section.
Each MAb or polyclonal response has a multi-part basic entry:
Record number: A unique number assigned by the database, in approximate order of entry. Please refer to this number if you have any comments or questions about an entry.
MAb ID: The name of the monoclonal antibody with synonyms in parentheses. MAbs often have several names. For example, punctuation can be lost and names are often shortened (M-70 in one paper can be M70 in another). Polyclonal responses are listed as "polyclonal" in this field.
HXB2 Location: Position of the Ab binding site relative to the viral strain HXB2 (GenBank Accession Number K03455), which is used as a reference strain throughout this publication. The numbering in this table corresponds to the protein maps. Because of HIV-1 variation, the epitope may not actually be present in HXB2; rather, the position in HXB2 indicates the position aligned to the epitope. HXB2 was selected as the reference strain because so many studies use HXB2, and because crystal structures for HXB2-related proteins are often available. The precise positions of an epitope on the HXB2 reference strain can be readily obtained using the interactive position locator at our Sequence Locator Tool.
Author Location: The amino acid positions of the epitope boundaries and the reference sequence used to define the epitope are listed as given in the primary publication. Frequently, these positions as published are imprecise and do not truly correspond to the numbering of the sequence, but they provide a reasonable guide to the peptide's approximate location in the protein. Also, in many cases, position numbers were provided but the reference sequence identification was not. Because of HIV-1's variability, position numbers require a reference strain to be meaningful. Binding sites that cannot be defined through peptide binding or interference studies are labeled as discontinuous. The approximate location on the protein, sequence number, and reference sequence are listed.
Sequence: The amino acid sequence of the binding region of interest, based on the reference strain used in the study defining the binding site. On occasions when only the position numbers, and not the actual peptide sequence, were specified in the original publication, we tried to fill in the peptide sequence based on the position numbers and reference strain. If the sequences were numbered inaccurately by the primary authors, or if we made a mistake in this process, we may have misrepresented the binding site's amino acid sequence. Because of this uncertainty, sequences that were not explicitly written in the primary publication, and that we determined by looking up the reference strain and the numbered location, are followed by a question mark in the table.
Neutralizing: "L" means the MAb neutralizes lab strains; "P" means it neutralizes at least some primary isolates; "no" means it does not neutralize. No information in this field means that neutralization was either not discussed or unresolved in the primary publications referring to the MAb.
Immunogen: The antigenic stimulus of the original B cell response. Often this is an HIV-1 infection. If a vaccine, rather than a natural infection, was the original antigenic stimulus, this is noted on a separate line, and additional information about the vaccine antigen is provided as available.
Species (Isotype): The host in which the antibody was generated, and the isotype of the antibody.
Donor: Information about an antibody's source or how to obtain it; this field also serves to give credit.
References: All publications that we could find that refer to the use of a specific monoclonal antibody. First is a list of all references. Some of the earlier references include notes with additional details, although we have tried to keep the entries self-contained since 1997.
Notes: Describe the context of each study, and what was learned about the antibody in the study.
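The codes used in the Neutralizing field can be decoded with a small lookup. This is a sketch only: the function name is hypothetical, and it assumes that when several codes apply to one MAb they are separated by whitespace, which the description above does not state.

```python
NEUTRALIZING_CODES = {
    "L": "neutralizes lab strains",
    "P": "neutralizes at least some primary isolates",
    "no": "does not neutralize",
}

def describe_neutralizing(field: str) -> list[str]:
    """Translate a Neutralizing field value into plain-language descriptions."""
    codes = field.split()
    if not codes:
        # An empty field means neutralization was not discussed or is unresolved.
        return ["not discussed or unresolved"]
    return [NEUTRALIZING_CODES[code] for code in codes]

print(describe_neutralizing("L"))
print(describe_neutralizing("L P"))
print(describe_neutralizing(""))
```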
The names of MAbs and the location of well-characterized linear binding sites of 21 amino acids or fewer are indicated relative to the protein sequences of the HXB2 clone. This map is meant to provide the relative location of epitopes on a given protein, but the HXB2 sequence may not actually bind to the MAb of interest, as it may vary relative to the sequence for which the epitope was defined. Above each linear binding site, the MAb name is given followed by the species in parentheses. Human is represented by h, non-human primate by p, mouse by m, and others by o. More precise species designations for any given MAb can be found using the web search interface.
Epitope alignments can be generated using the epitope search tools. All epitopes are aligned to the HXB2 sequence, with the sequence used to define the epitope indicated directly above it. Sequences are sorted by their subtype and country of origin.
The master alignment files from which the epitope alignments were created are available at our sequence web site (HIV web alignments). The alignments were modified in some cases to optimize the alignment relative to the defined epitope and minimize insertions and deletions; epitope alignments are generated by anchoring on the C-terminal residue. A dash indicates identity to the consensus sequence, and a period indicates an insertion made to maintain the alignment. Stop codons are indicated with a $, and frameshifts by a #, or ambiguous codons (nucleotide was r, y, or n) by an x; they are inserted to maintain the alignments. In consensus sequences an upper case letter indicates the amino acid was present in all sequences, a lower case letter indicates the amino acid was present in most sequences in a given position, and a question mark indicates two or more amino acids were represented with equal frequency.
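The upper case, lower case, and question mark convention for consensus sequences can be sketched as follows. The functions are hypothetical, "most sequences" is interpreted here as the single most frequent residue in a column, and special symbols such as insertions are not treated separately; the input strings are made up for the example.

```python
from collections import Counter

def consensus_column(residues: str) -> str:
    """Return the consensus symbol for one alignment column."""
    counts = Counter(residues).most_common()
    top_residue, top_count = counts[0]
    if top_count == len(residues):
        return top_residue.upper()   # present in all sequences
    if len(counts) > 1 and counts[1][1] == top_count:
        return "?"                   # two or more residues tied for most frequent
    return top_residue.lower()       # present in most sequences

def consensus(sequences):
    """Build a consensus string from equal-length aligned sequences."""
    return "".join(consensus_column("".join(col)) for col in zip(*sequences))

print(consensus(["KAL", "KAL", "RGL", "RAL"]))
```

In this example the first column is tied between K and R, the second is mostly A, and the third is L in every sequence.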
This is a brief description of the database fields in the search and results pages. Please see above for more details.
Last modified: Tue Mar 6 11:41:37 MST 2007