Elsevier

Human Immunology

Volume 63, Issue 9, September 2002, Pages 701-709

Original contribution
Prediction of MHC class I binding peptides using profile motifs

https://doi.org/10.1016/S0198-8859(02)00432-9

Abstract

Peptides that bind to a given major histocompatibility complex (MHC) molecule share sequence similarity. Therefore, a position-specific scoring matrix (PSSM) or profile derived from a set of peptides known to bind to a specific MHC molecule would be a suitable predictor of whether other peptides might bind, thus anticipating possible T-cell epitopes within a protein. In this approach, the binding potential of any peptide sequence (query) to a given MHC molecule is linked to its similarity to a group of aligned peptides known to bind to that MHC, and can be obtained by comparing the query to the PSSM. This article describes the derivation of alignments and profiles from a collection of peptides known to bind a specific MHC, compatible with the structural and molecular basis of the peptide-MHC class I (MHCI) interaction. Moreover, in order to apply these profiles to the prediction of peptide-MHCI binding, we have developed a new search algorithm (RANKPEP) that ranks all possible peptides from an input protein using the PSSM coefficients. The predictive power of the method was evaluated by running RANKPEP on proteins known to bear MHCI Kb- and Db-restricted T-cell epitopes. Analysis of the results indicates that > 80% of these epitopes are among the top 2% of scoring peptides. Prediction of peptide-MHC binding using a variety of MHCI-specific PSSMs is available online at our RANKPEP web server (www.mifoundation.org/Tools/rankpep.html). In addition, the RANKPEP server allows the user to enter additional profiles, making the server a powerful and versatile computational biology benchmark for the prediction of peptide-MHC binding.

Introduction

Major histocompatibility complex (MHC) molecules play a key role in the immune system by capturing peptide antigens for display on cell surfaces. Different MHC molecules bind distinct sets of peptides. Subsequently, these peptide-MHC complexes (pMHC) are recognized by T cells via their T-cell receptors (TCR) (reviewed in references 1, 2, 3, 4). T-cell recognition is thus restricted to those peptides that the MHC molecules can present. Therefore, prediction of peptides that can bind to MHC molecules is important for identification of peptides capable of eliciting a T-cell response.

There are two major classes of MHC molecules, class I and class II (MHCI and MHCII, respectively) that, despite their structural similarity, differ in many ways [5]. MHCI are recognized by CD8 cytotoxic T lymphocytes (CTL), whereas MHCII are recognized by CD4 helper T cells. The type of peptide that MHCI and MHCII bind is also different. MHCI molecules bind short peptides, usually between 8 and 10 residues, with their N- and C-terminal ends pinned in the peptide binding groove [1]. In contrast, peptides bound to MHCII are longer, more variable in length (9 to 22 residues), and both the N- and C-terminal ends of the peptide can extend beyond the peptide binding groove [1, 3]. The binding motifs for MHCII are less well defined than those for MHC class I [6]. In this article, we will focus on prediction of peptide binding to MHCI molecules.

MHCI binding peptides are related by sequence similarity, and therefore prediction of pMHCI binding has traditionally been accomplished using sequence motif patterns as predictors [7]. These sequence patterns are usually extracted from large numbers of existing known peptides, or from pool sequencing experiments [6, 8]. The specific amino acids present in the pattern are called anchor residues, and the positions where they occur are termed anchor positions [8]. For example, the sequence patterns described [8] for Kb octamers and Db nonamers are the following:

Kb X-X-Y-X-[YF]-X-X-[LMIV]

Db X-X-X-X-N-X-X-X-[LMIV].
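As an illustration only, the two anchor patterns above can be written as regular expressions and scanned along a protein sequence. This is a minimal sketch, not code from the article: the function name find_motif_hits and the toy sequence are hypothetical, and X is assumed to mean any of the 20 standard amino acids.

```python
import re

# Assumption: "X" in the published patterns means any standard amino acid.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
X = f"[{AMINO_ACIDS}]"

# Patterns transcribed from the text. The zero-width lookahead lets
# overlapping windows be reported instead of only the leftmost match.
KB_OCTAMER = re.compile(f"(?=({X}{{2}}Y{X}[YF]{X}{{2}}[LMIV]))")  # X-X-Y-X-[YF]-X-X-[LMIV]
DB_NONAMER = re.compile(f"(?=({X}{{4}}N{X}{{3}}[LMIV]))")         # X-X-X-X-N-X-X-X-[LMIV]


def find_motif_hits(protein, pattern):
    """Return (1-based start, peptide) for every window that matches the pattern."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]


# Hypothetical toy sequence, made up for this example.
toy_protein = "MGAYAFGGLKSSAANQRTVAA"
print(find_motif_hits(toy_protein, KB_OCTAMER))  # [(2, 'GAYAFGGL')]
print(find_motif_hits(toy_protein, DB_NONAMER))  # [(11, 'SSAANQRTV')]
```

A pattern of this kind gives only a yes/no answer per window: a peptide either carries the anchor residues or it does not, with no graded score.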

Such sequence patterns, however, have proven to be too simple, as the binding ability of a peptide to a given MHC molecule cannot be explained exclusively in terms of the presence or absence of a few anchor residues [9, 10]. In response to these limitations, motif matrices have also been developed to account for the preference of every amino acid type at every position in the peptide [6, 11]. Coefficients in these matrices relate to the strength of the amino acid signals in a pool sequence of peptides eluted from a given MHCI molecule, or to the occurrence of an amino acid in a set of binding peptides. However, the precise way in which the coefficients are derived is not clear.

The above matrices offer two good efforts at representing the complexity of MHCI binding motifs [6, 11]. Nevertheless, it is well established that position-specific scoring matrices (PSSM) or profiles created from a set of aligned sequences provide a better way of defining and recognizing sequence motifs [12]. There are several methods to generate PSSMs from aligned sequences, usually including distinct sequence weighting methods [13, 14]. In all cases, profile coefficients relate to the observed frequency of every amino acid at each column of the alignment, corrected by the expected frequency of that amino acid in the background using a reference database. Thus, in this approach the binding potential of any peptide (query) to a given MHC molecule can be obtained by comparing the query to a PSSM created from a set of aligned MHCI-specific peptides. In this article we describe a new search algorithm, RANKPEP, that ranks all possible peptides from a test protein using PSSM coefficients. In addition, this study shows, for Kb and Db molecules, that profiles created from aligned peptides are very sensitive in identifying MHCI-restricted epitopes. These profiles are guided by recent structural data indicating differences in binding residues involving peptides of distinct length. Peptide-MHC binding prediction using PSSMs is available at our RANKPEP web server (www.mifoundation.org/Tools/rankpep.html), where users can select the provided PSSMs or enter their own.
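The general idea of profile-based ranking can be sketched as follows. This is a hedged illustration rather than the RANKPEP implementation: the weighting scheme, pseudocounts, and background frequencies used by the authors are not given in this section, so the sketch assumes unweighted sequences, an add-one pseudocount, a uniform background of 1/20, and log2 odds. The function names and the toy peptides are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_peptides, background=None, pseudocount=1.0):
    """Build a log-odds profile (one dict per column) from equal-length peptides.

    Assumptions for illustration: no sequence weighting, add-one pseudocounts,
    and a uniform background unless another one is supplied.
    """
    length = len(aligned_peptides[0])
    if any(len(p) != length for p in aligned_peptides):
        raise ValueError("peptides must be aligned to the same length")
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    total = len(aligned_peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        column = [p[pos] for p in aligned_peptides]
        # Observed frequency at this column, corrected by the background frequency.
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background[aa])
            for aa in AMINO_ACIDS
        })
    return pssm


def score_peptide(peptide, pssm):
    """Sum the profile coefficients for each residue at its position."""
    return sum(column[aa] for aa, column in zip(peptide, pssm))


def rank_peptides(protein, pssm):
    """Score every window of profile width; return (score, 1-based start, peptide), best first.

    Non-standard residue codes in the protein raise KeyError in this sketch.
    """
    width = len(pssm)
    scored = [
        (score_peptide(protein[i:i + width], pssm), i + 1, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Hypothetical toy data, made up for this example.
toy_binders = ["AAYAFAAL", "GSYKYTQL", "TTYEFRPM"]
toy_pssm = build_pssm(toy_binders)
for score, start, peptide in rank_peptides("MKTAAYAFAALGG", toy_pssm):
    print(f"{start:3d}  {peptide}  {score:6.2f}")
```

Unlike a fixed anchor pattern, every window receives a graded score, so candidate peptides can be ordered and a cutoff (for example, a top percentage of windows) chosen afterwards.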

Section snippets

Peptide and protein sequences

Sequences of peptides that bind to MHC molecules were collected from the MHCPEP database [15], which is available for download from the World Wide Web (http://wehih.wehi.edu.au/mhcpep/). The MHCPEP database contains 13,423 peptide entries distributed among 281 MHC specificities. All peptides in the MHCPEP database are binders, but their binding strength for a specific MHC molecule is reported as unknown, low, moderate or high. This work has excluded MHC class I ligands that were ranked as low

MHCI molecules: correlation between structure and specificity

Peptides bound to an MHCI molecule are in an extended conformation with several side chains accommodated in the binding pockets of the MHCI binding groove (Figure 1), and the N- and C-termini pinned into the groove, connected by a network of hydrogen bonds with conserved residues of the MHCI molecule [1, 23, 24]. In turn, the binding pockets of the MHCI are delineated by polymorphic side chains, providing the molecular basis for the peptide specificity of the different MHC molecules and the

Conclusions

CTL responses rely on the recognition of peptides that must be presented on the target cell surface by MHC class I molecules. Therefore, determination of peptides that bind to MHCI molecules is important and has been approached by several methods, including quantitative matrices [26, 27, 28], neural networks [29, 30], and peptide threading [31, 32]. Although a direct comparison between the various methods is not straightforward due to the different criteria followed by authors to assess the power of

Acknowledgements

This work was supported by NIH grant AI50900. Doctor Reche is supported by funds from the Molecular Immunology Foundation. We thank Drs. Linda Clayton, Masha Fridkis-Hareli, Rob Meijers, Esther Lafuente, Bruce Reinhold, and Jia-huai Wang for reading and comments.

References (42)

  • K Gulukota et al.

    Two complementary methods for predicting peptides binding major histocompatibility complex molecules

    J Mol Biol

    (1997)
  • Y Altuvia et al.

    A structure-based algorithm to predict potential binding peptides to MHC molecules with hydrophobic binding pockets

    Hum Immunol

    (1997)
  • R.A Sayle et al.

    RASMOL: biomolecular graphics for all

    Trends Biochem Sci

    (1995)
  • D.R Madden

    The three-dimensional structure of peptide-MHC complexes

    Annu Rev Immunol

    (1995)
  • S.G.E Marsh et al.

    The HLA Facts Book

    (2000)
  • H.G Rammensee et al.

    SYFPEITHI: database for MHC ligands and peptide motifs

    Immunogenetics

    (1999)
  • K Falk et al.

    Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules

    Nature

    (1991)
  • M Bouvier et al.

    Importance of peptide amino acid and carboxyl termini to the stability of MHC class I molecules

    Science

    (1994)
  • A.S De Groot et al.

    An interactive web site providing major histocompatibility ligand predictions: application to HIV research and AIDS

    AIDS Res Hum Retroviruses

    (1997)
  • M Gribskov et al.

    Profile analysis: detection of distantly related proteins

    Proc Natl Acad Sci USA

    (1987)
  • J.G Henikoff et al.

    Substitution probabilities to improve position-specific scoring matrices

    Comput Appl Biosci

    (1996)