HIV molecular immunology database
The HLA binding motif scanner finds HLA anchor residue motifs within protein sequences for specified HLA serotypes, genotypes or supertypes. The potential epitopes are included in the output. Two major motif libraries were used:
The motifs presented are linked to their sources, and you can choose which ones to use for scanning the sequences. We also regularly search the literature for new motifs that are not listed in these two major sources. What we find is presented as an additional source. You can also use your own custom motif, composed from the information we present and from your own data.
The supermotifs and supertypes classification is taken from
Supermotifs indicate the residues defining supertype specificities. The supermotifs incorporate residues that are recognized by multiple alleles within the supertype.
This tool searches for anchor motifs only. If you want additional information on auxiliary amino acids, please look at the original motif libraries. However, you can still use our tool with the auxiliary amino acids if you compose your own custom motif using the information from these sources.
View or download the HLA genotype/serotype dictionary.
View or download the HLA genotype/motif dictionary.
View or download the HLA supertype dictionary.
x-[VTILF]-x-x-x-x-x-x-[YF(ML)]
This notation means that the second and C-terminal positions are anchor positions. The dominant amino acids at the second position are V, T, I, L and F, and at the C-terminal anchor position the dominant amino acids are Y and F, while M and L are preferred but not dominant. Note that by default, unless you specify your own motif, we search on all anchor position amino acids, both dominant and preferred but not dominant, so the information on which amino acids are less dominant is presented for your information only. However, if you want to search on the dominant amino acids only, you can compose your own motif using the information we present. If you have any questions about how it was decided which amino acid is dominant and which is not, please address them to the authors who published these motifs.
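The notation above can be illustrated with a minimal Python sketch that turns a motif string into a regular expression. This is not the scanner's own code: the function name motif_to_regex and the dominant_only flag are hypothetical, and the sketch assumes that parentheses inside brackets always mark the preferred but not dominant residues.

```python
import re

def motif_to_regex(motif, dominant_only=False):
    """Translate a motif such as x-[VTILF]-x-x-x-x-x-x-[YF(ML)] into a regex string.

    Hypothetical helper: residues in parentheses are treated as
    "preferred but not dominant" and are searched by default.
    """
    parts = []
    # Dashes are only separators; each position is 'x' or a bracketed anchor set.
    for token in re.findall(r"x|\[[^\]]*\]", motif.replace("-", "")):
        if token == "x":
            parts.append(".")  # any residue
            continue
        body = token[1:-1]
        dominant = re.sub(r"\([A-Z]*\)", "", body)              # outside parentheses
        preferred = "".join(re.findall(r"\(([A-Z]*)\)", body))  # inside parentheses
        allowed = dominant if dominant_only else dominant + preferred
        parts.append("[" + allowed + "]")
    return "".join(parts)

example = "x-[VTILF]-x-x-x-x-x-x-[YF(ML)]"
print(motif_to_regex(example))                      # .[VTILF]......[YFML]
print(motif_to_regex(example, dominant_only=True))  # .[VTILF]......[YF]
```

With the default setting a 9-mer ending in M matches the example motif; with dominant_only=True it does not, which mirrors the effect of composing a custom motif from the dominant residues only.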
Enclose the residues allowed at each anchor position in square brackets [], and enter arbitrary residues with an x. You may optionally use a dash (-) to separate the residues. For example, x[LM]xxx[K]xx[V] or x-[LM]-x-x-x-[K]-x-x-[V].
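As a rough illustration of the custom motif syntax, the following Python sketch splits a motif into one token per position and shows that the dashed and undashed forms are equivalent. The function name parse_custom_motif is hypothetical, and the rejection of any other characters is an assumption; the tool's actual validation rules are not documented here.

```python
import re

def parse_custom_motif(text):
    """Split a custom motif into per-position tokens; dashes are optional separators.

    Hypothetical helper: only a lowercase x and bracketed uppercase
    residues are accepted, anything else raises ValueError.
    """
    compact = text.strip().replace("-", "")
    tokens = re.findall(r"x|\[[A-Z]+\]", compact)
    # If the tokens do not rebuild the input exactly, something was unrecognized.
    if "".join(tokens) != compact:
        raise ValueError("unrecognized motif syntax: " + text)
    return tokens

plain = parse_custom_motif("x[LM]xxx[K]xx[V]")
dashed = parse_custom_motif("x-[LM]-x-x-x-[K]-x-x-[V]")
print(plain == dashed, len(plain))  # True 9
```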
The sequences consist of the amino acid codes ACDEFGHIKLMNPQRSTVWYBZX and the gap code -. All other characters are removed and ignored.
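The filtering rule can be sketched in a few lines of Python. The name clean_sequence is hypothetical, and converting input to uppercase before filtering is an assumption made for this sketch; the text above does not say how lowercase letters are treated.

```python
# Accepted amino acid codes plus the gap code, as listed above.
VALID_CODES = set("ACDEFGHIKLMNPQRSTVWYBZX-")

def clean_sequence(raw):
    """Drop every character that is not an accepted code.

    Assumption: input is treated case-insensitively, so it is
    uppercased before filtering.
    """
    return "".join(ch for ch in raw.upper() if ch in VALID_CODES)

print(clean_sequence("MEN 12RWQ*-v\n"))  # MENRWQ-V
```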
Gaps are ignored unless the input sequences form an alignment. Two sequence formats are accepted, FASTA and Table, and examples of these formats are shown below. For more information about sequence formats, see Common Sequence Formats.
FASTA format:
>sequence_a
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPR
ISSEVHIPLGDARLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDP
ELADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLAL
AALITPKKIKPPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH
>sequence_b
MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHIYETY
GDTWAGVEAIIRILQQLLFIHFRIGCRHSRIGVTRQRRARNGASRS
>sequence_c
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRK
KRRQRRRAHQNSQTHQASLSKQPTSQPRGDPTGPKExKKKVERETETDPF
D
Table format:
sequence_a MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPR
sequence_b MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHTY
sequence_c MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKD
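To make the two input formats concrete, here is a minimal Python sketch of a reader that accepts either one. It is an illustration only: the function name parse_sequences is hypothetical, and it assumes that FASTA input starts with a > header line and that Table input holds one name and one sequence per line, separated by whitespace.

```python
def parse_sequences(text):
    """Return a list of (name, sequence) pairs from FASTA or Table input."""
    records = []
    if text.lstrip().startswith(">"):
        # FASTA: a '>' header line followed by one or more sequence lines.
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                records.append([line[1:].strip(), ""])
            elif records:
                records[-1][1] += line
    else:
        # Table: one record per line, name first, sequence second.
        for line in text.splitlines():
            fields = line.split(None, 1)
            if len(fields) == 2:
                records.append([fields[0], "".join(fields[1].split())])
    return [(name, seq) for name, seq in records]

fasta = ">sequence_a\nMENRW\nQVMIV\n>sequence_b\nMEQAP\n"
table = "sequence_a MENRWQVMIV\nsequence_b MEQAP\n"
print(parse_sequences(fasta) == parse_sequences(table))  # True
```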
The program works in stages. First, the motifs corresponding to the input HLA type are presented. Then you choose which motifs to scan against, choose the motif length, load your sequences or choose predefined sequences, and scan these sequences for the respective motifs.
The final output is organized by search pattern: all motifs with identical search patterns are grouped together. The matching binding motifs are presented on the input sequences in two colors: C-terminal anchor amino acids are shown in magenta and anchor amino acids in the other positions are shown in cyan. If a given amino acid is matched by more than one motif, it is highlighted as a C-terminal anchor amino acid if any of those motifs matches it at the C-terminal anchor. All anchor amino acids are shown in uppercase and non-anchors in lowercase. Following the sequences is a list of potential epitopes showing their positions in the input sequences.
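The scan and the uppercase/lowercase presentation can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: the names scan and mark_anchors are hypothetical, a motif is given as a list with None for an arbitrary position and a string of allowed residues for an anchor, positions are reported 1-based, and the magenta/cyan coloring is left out.

```python
def scan(sequence, motif):
    """Find every window of the sequence that matches the motif.

    motif: one entry per position; None means any residue,
    a string lists the residues allowed at an anchor position.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if all(allowed is None or aa in allowed
               for aa, allowed in zip(window, motif)):
            hits.append((start + 1, window))  # 1-based start (assumption)
    return hits

def mark_anchors(sequence, motif, hits):
    """Uppercase residues matched at anchor positions, lowercase the rest."""
    out = list(sequence.lower())
    for start, _ in hits:
        for offset, allowed in enumerate(motif):
            if allowed is not None:
                i = start - 1 + offset
                out[i] = out[i].upper()
    return "".join(out)

# x[LM]xxx[K]xx[V] written as a position list.
motif = [None, "LM", None, None, None, "K", None, None, "V"]
hits = scan("AALAAAKAAVAA", motif)
print(hits)                                       # [(2, 'ALAAAKAAV')]
print(mark_anchors("AALAAAKAAVAA", motif, hits))  # aaLaaaKaaVaa
```

Checking every start position, rather than skipping past each match, lets overlapping potential epitopes be reported.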
You can also view and download the resulting sequences in FASTA format, where the anchor amino acids are presented in uppercase and all the remaining ones in lowercase. The potential epitopes can also be downloaded in CSV (comma-separated values) format, which can be read into a spreadsheet. This output is convenient for further analysis.
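For readers who want to produce a comparable CSV file from their own scan results, a minimal Python sketch follows. The function name epitopes_to_csv and the column names (sequence, start, end, epitope) are hypothetical; the actual columns of the downloadable file are not specified here.

```python
import csv
import io

def epitopes_to_csv(rows):
    """Serialize potential epitopes as CSV text.

    rows: iterable of (sequence_name, start, end, epitope) tuples.
    The column names are illustrative, not the tool's own.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sequence", "start", "end", "epitope"])
    writer.writerows(rows)
    return buffer.getvalue()

print(epitopes_to_csv([("sequence_a", 2, 10, "ALAAAKAAV")]))
```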
Last modified: Thu Jun 9 09:04:55 MDT 2005