Original contribution
Prediction of MHC class I binding peptides using profile motifs
Introduction
Major histocompatibility complex (MHC) molecules play a key role in the immune system by capturing peptide antigens for display on cell surfaces. Different MHC molecules bind distinct sets of peptides. Subsequently, these peptide-MHC complexes (pMHC) are recognized by T cells via their T-cell receptors (TCR) (reviewed in references 1, 2, 3, 4). T-cell recognition is thus restricted to those peptides that the MHC molecules can present. Therefore, prediction of peptides that can bind to MHC molecules is important for identification of peptides capable of eliciting a T-cell response.
There are two major classes of MHC molecules, class I and class II (MHCI and MHCII, respectively), that, despite their structural similarity, differ in many ways [5]. MHCI are recognized by CD8 cytotoxic T lymphocytes (CTL), whereas MHCII are recognized by CD4 helper T cells. The type of peptide that MHCI and MHCII bind is also different. MHCI molecules bind short peptides, usually between 8 and 10 residues, with their N- and C-terminal ends pinned in the peptide binding groove [1]. In contrast, peptides bound to MHCII are longer, more variable in length (9 to 22 residues), and both the N- and C-terminal ends of the peptide can extend beyond the peptide binding groove [1, 3]. The binding motifs for MHCII are less well defined than those for MHCI [6]. In this article, we will focus on prediction of peptide binding to MHCI molecules.
MHCI binding peptides are related by sequence similarity, and therefore prediction of pMHCI binding has traditionally been accomplished using sequence motif patterns as predictors [7]. These sequence patterns are usually extracted from large numbers of known binding peptides, or from pool sequencing experiments [6, 8]. The specific amino acids present in the pattern are called anchor residues, and the positions where they occur are termed anchor positions [8]. For example, the sequence patterns described [8] for Kb octamers and Db nonamers are the following:
.
Such sequence patterns, however, have proven to be too simple, as the binding ability of a peptide to a given MHC molecule cannot be explained exclusively in terms of the presence or absence of a few anchor residues [9, 10]. In response to these limitations, motif matrices have also been developed to account for the preference of every amino acid type at every position in the peptide [6, 11]. Coefficients in these matrices relate to the strength of the amino acid signals in a pool sequence of peptides eluted from a given MHCI molecule, or to the occurrence of an amino acid in a set of binding peptides. However, the precise way in which the coefficients are derived is not clear.
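The limitation of anchor-only patterns can be seen in a minimal sketch. The pattern below is an assumption made purely for illustration (it is not quoted from reference 8), and the function name and input peptides are hypothetical. The point is that an anchor pattern gives only a yes/no answer and ignores every non-anchor position.

```python
import re

# Assumed pattern, for illustration only: an 8-residue peptide with
# F or Y at position 5 and L, M, I or V at position 8. All other
# positions are unconstrained.
ANCHOR_PATTERN = re.compile(r"^.{4}[FY].{2}[LMIV]$")


def matches_anchor_pattern(peptide: str) -> bool:
    """Return True if the peptide carries the assumed anchor residues."""
    return ANCHOR_PATTERN.match(peptide) is not None


# Toy input peptides (hypothetical, not data from the article).
peptides = ["SIINFEKL", "SIINAEKL", "AAAAYAAV"]
hits = [p for p in peptides if matches_anchor_pattern(p)]
print(hits)
```

Two peptides that differ at every non-anchor position are treated identically by such a pattern, which is the kind of information that matrix-based approaches try to capture.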
The above matrices are two good attempts at representing the complexity of MHCI binding motifs [6, 11]. Nevertheless, it is well established that position-specific scoring matrices (PSSMs), or profiles, created from a set of aligned sequences provide a better way of defining and recognizing sequence motifs [12]. There are several methods for generating a PSSM from aligned sequences, usually incorporating distinct sequence weighting methods [13, 14]. In all cases, profile coefficients relate to the observed frequency of every amino acid in a given column of the alignment, corrected by the expected frequency of that amino acid in the background, as estimated from a reference database. Thus, in this approach the binding potential of any peptide (query) to a given MHC molecule can be obtained by comparing the query to a PSSM created from a set of aligned MHCI-specific peptides. In this article we describe a new search algorithm, RANKPEP, that ranks all possible peptides from a test protein using PSSM coefficients. In addition, this study shows, for Kb and Db molecules, that profiles created from aligned peptides are very sensitive in identifying MHCI-restricted epitopes. These profiles are guided by recent structural data indicating differences in binding residues involving peptides of distinct length. Peptide-MHC binding prediction using PSSMs is available at our RANKPEP web server (www.mifoundation.org/Tools/rankpep.html), where users can select the provided PSSMs or enter their own.
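The profile approach described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the RANKPEP implementation: it uses a uniform background instead of frequencies from a reference database, a simple pseudocount, and no sequence weighting. All function names and the toy peptides are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_peptides, background=None, pseudocount=1.0):
    """Build a log-odds profile from equal-length peptides.

    Assumptions: uniform background unless one is supplied, and no
    sequence weighting (both are simplifications).
    """
    length = len(aligned_peptides[0])
    if any(len(p) != length for p in aligned_peptides):
        raise ValueError("all peptides must have the same length")
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    n = len(aligned_peptides)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in aligned_peptides)
        column = {}
        for aa in AMINO_ACIDS:
            # Observed frequency, smoothed so unseen residues stay finite.
            freq = (counts[aa] + pseudocount * background[aa]) / (n + pseudocount)
            # Coefficient: observed frequency relative to the background.
            column[aa] = math.log2(freq / background[aa])
        pssm.append(column)
    return pssm


def score_peptide(peptide, pssm):
    """Sum the profile coefficients; unknown residues contribute 0."""
    return sum(column.get(aa, 0.0) for aa, column in zip(peptide, pssm))


def rank_peptides(protein, pssm):
    """Score every window of profile width and sort by descending score."""
    width = len(pssm)
    windows = [protein[i:i + width] for i in range(len(protein) - width + 1)]
    return sorted(((score_peptide(w, pssm), w) for w in windows), reverse=True)


# Toy inputs chosen for illustration, not taken from the article's dataset.
binders = ["SIINFEKL", "SIIVFEKL", "RGYVYQGL", "KVVRFDKL"]
pssm = build_pssm(binders)
ranking = rank_peptides("MSIINFEKLTEW", pssm)
for score, peptide in ranking:
    print(f"{peptide}\t{score:.2f}")
```

Unlike an anchor pattern, every position contributes to the score, so candidate peptides from a protein can be ordered rather than only accepted or rejected.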
Peptide and protein sequences
Sequences of peptides that bind to MHC molecules were collected from the MHCPEP database [15], which is available for download from the World Wide Web (http://wehih.wehi.edu.au/mhcpep/). The MHCPEP database contains 13,423 peptide entries distributed among 281 MHC specificities. All peptides in the MHCPEP database are binders, but their binding strength for a specific MHC molecule is reported as unknown, low, moderate or high. This work has excluded MHC class I ligands that were ranked as low
MHCI molecules: correlation between structure and specificity
Peptides bound to an MHCI molecule are in an extended conformation, with several side chains accommodated in the binding pockets of the MHCI binding groove (Figure 1) and the N- and C-termini pinned into the groove, connected by a network of hydrogen bonds with conserved residues of the MHCI molecule [1, 23, 24]. In turn, the binding pockets of the MHCI are delineated by polymorphic side chains, providing the molecular basis for the peptide specificity of the different MHC molecules and the
Conclusions
CTL responses rely on the recognition of peptides that must be presented on the target cell surface by MHC class I molecules. Therefore, determination of peptides that bind to MHCI molecules is important and has been approached by several methods, including quantitative matrices [26, 27, 28], neural networks [29, 30], and peptide threading [31, 32]. Although a direct comparison between the various methods is not straightforward due to the different criteria followed by authors to assess the power of
Acknowledgements
This work was supported by NIH grant AI50900. Doctor Reche is supported by funds from the Molecular Immunology Foundation. We thank Drs. Linda Clayton, Masha Fridkis-Hareli, Rob Meijers, Esther Lafuente, Bruce Reinhold, and Jia-huai Wang for reading and comments.
References (42)
Interactions of TCRs with MHC-peptide complexes: a quantitative basis for mechanistic models. Curr Opin Immunol (1997)
Antigen peptide binding by class I and class II histocompatibility proteins. Structure (1994)
Structural basis of cell-cell interactions in the immune system. Curr Opin Structural Biol (2000)
A computer program for predicting possible cytotoxic T lymphocyte epitopes based on HLA class I peptide binding motifs. Hum Immunol (1995)
Prominent role of secondary anchor residues in peptide binding to HLA-A2.1 molecules. Cell (1993)
Position-based sequence weights. J Mol Biol (1994)
Weighting aligned protein or nucleic acid sequences to correct for unequal representation. J Mol Biol (1990)
Structural principles that govern the peptide-binding motifs of class I MHC molecules. J Mol Biol (1998)
The antigenic identity of peptide-MHC complexes: a comparison of the conformations of five viral peptides presented by HLA-A2. Cell (1993)
Prediction of binding to MHC class I molecules. J Immunol Methods (1995)
Two complementary methods for predicting peptides binding major histocompatibility complex molecules. J Mol Biol
A structure-based algorithm to predict potential binding peptides to MHC molecules with hydrophobic binding pockets. Hum Immunol
RASMOL: biomolecular graphics for all. Trends Biochem Sci
The three-dimensional structure of peptide-MHC complexes. Annu Rev Immunol
The HLA Facts Book
SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics
Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature
Importance of peptide amino acid and carboxyl termini to the stability of MHC class I molecules. Science
An interactive web site providing major histocompatibility ligand predictions: application to HIV research and AIDS. AIDS Res Hum Retroviruses
Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci USA
Substitution probabilities to improve position-specific scoring matrices. Comput Appl Biosci