Abstract
Background Tumor-associated antigens and their derived peptides constitute an opportunity to design off-the-shelf mainline or adjuvant anti-cancer immunotherapies for a broad array of patients. A performant and rational antigen selection pipeline would lay the foundation for immunotherapy trials with the potential to enhance treatment, tremendously benefiting patients suffering from rare, understudied cancers.
Methods We present an experimentally validated, data-driven computational pipeline that selects and ranks antigens in a multipronged approach. In addition to minimizing the risk of immune-related adverse events by selecting antigens based on their expression profile in tumor biopsies and healthy tissues, we incorporated a network analysis-derived antigen indispensability index based on computational modeling results, and candidate immunogenicity predictions from a machine learning ensemble model relying on peptide physicochemical characteristics.
Results In a model study of uveal melanoma, Human Leukocyte Antigen (HLA) docking simulations and experimental quantification of the peptide–major histocompatibility complex binding affinities confirmed that our approach discriminates between high-binding and low-binding affinity peptides with a performance similar to that of established methodologies. Blinded validation experiments with autologous T-cells yielded peptide stimulation-induced interferon-γ secretion and cytotoxic activity despite high interdonor variability. Dissecting the score contribution of the tested antigens revealed that peptides with the potential to induce cytotoxicity but unsuitable due to potential tissue damage or instability of expression were properly discarded by the computational pipeline.
Conclusions In this study, we demonstrate the feasibility of the de novo computational selection of antigens with the capacity to induce an anti-tumor immune response and a predicted low risk of tissue damage. On translation to the clinic, our pipeline supports fast turn-around validation, for example, for adoptive T-cell transfer preparations, in both generalized and personalized antigen-directed immunotherapy settings.
- Immunotherapy
- Antigens, Neoplasm
- Self Tolerance
- Computational Biology
- Systems Biology
Data availability statement
Data are available in a public, open access repository. Data are available upon reasonable request. The public transcriptomics data used to prioritize genes are available on the GDC Data Portal under the Project ID TCGA-UVM (https://portal.gdc.cancer.gov/projects/TCGA-UVM). All data (transcriptomics or otherwise) generated by this study are available upon reasonable request to the corresponding author.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Selection pipelines for anti-cancer peptides are frequently deployed on a limited set of preselected protein antigens, mostly focus on modeling the peptide–major histocompatibility complex (MHC) interaction and the peptide’s immunogenicity, and lack a validation step.
WHAT THIS STUDY ADDS
Our pipeline evaluates transcriptome-wide expression to prioritize genes and incorporates complementary measures of antigen suitability, including sequence identity with unrelated antigens, antigen indispensability for tumor growth and a re-evaluation of MHC binding affinity and immunogenicity.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Controlled, automated consideration of features besides efficacy, namely off-target effects and unfavorability for the tumor to suppress the antigen target, might improve both safety and efficacy in future immunotherapy trials and treatments.
Background
Targeted anti-cancer immunotherapy can be directed against neoantigens, arising somatically in an individual, or tumor-associated antigens (TAAs), which are unmutated loci restrictively expressed in the tumor.1 In recent years, these approaches have found broad application in clinical trials, and TAAs in particular promise cohort-specific treatment options with different strategies.2 The approaches use different vector systems to ultimately stimulate or exploit an adaptive immune response against the targeted antigen by engaging peptide-loaded major histocompatibility complex class I (MHC-I) interactions with activated cytotoxic T lymphocytes (CTLs).3 However, one key challenge in making these therapies more efficacious lies in finding antigens that are broadly applicable and induce a durable anti-tumor response without serious off-tissue effects.4 Intricacies like immune-evasive antigen loss in the tumor and life-threatening adverse events are likely responsible for the scarcity of approved antigen-targeted immunotherapies.5 6 Hence, given the increasing interest in these therapies, there is a need for computational pipelines that integrate biomedical parameters such as the risk of side effects, tumor restriction and immunogenic potential into practical frameworks. We intended to address this need with our work, applying our methodology to find antigens against metastatic uveal melanoma (UM).
UM is the most frequent primary ocular malignancy in adults and a poorly treatable cancer.7 8 Approximately half of the patients develop metastases within 10 years, usually in the liver, with a median post-metastasis survival time of less than a year.9 Despite recent advances with tebentafusp (tradename Kimmtrak) in HLA-A*02:01-positive patients, treatment options for metastasized UM show limited efficacy.8
Our goal in this study was to develop a data-driven computational predictor for TAAs with a predicted low risk of tissue damage that addresses more of the above intricacies. In an ensemble model approach, we combined expression in tumor and tissue, tumor gene importance, and database knowledge to predict the treatment efficacy and tolerability of TAAs and their derived epitopes.
To support our in-silico model, we selected epitopes of predicted high or low efficacy for UM and experimentally compared their immunogenic potential against tumor and off-target cells with that of epitopes recommended by established computational tools. We also quantified the MHC-binding affinities of the selected epitopes and confirmed that our pipeline discriminates correctly between high-affinity and low-affinity peptides. Further, we show that pools of our high-efficacy peptides elicit an interferon gamma (IFN-γ) response in autologous CD8+ T cells in vitro. Additionally, T cells primed with these candidate peptides killed cells of the UM cell line 92.1 more efficiently than T cells primed with control peptides. We provide the annotated results as a database free of charge for non-commercial use at https://www.curatopes.com/uvealmelanoma.
Methods
Overview of the bioinformatics peptide selection workflow
The methodology developed to select TAAs with a predicted low risk of tissue damage for targeted immunotherapy integrates a data-driven computational workflow and an experimental validation procedure (figure 1A). The bioinformatics workflow integrates transcriptomics, network analysis and supervised machine learning (ML) for predicting peptide binding and immunogenicity as follows:
Transcriptomics-based prioritization of TAAs minimizing immune-related adverse events (irAE). We utilized transcriptomics and histological data to select highly tumor-restricted genes that show no residual expression in any healthy tissue for which quantitative data are available in standardized repositories. To this end, we obtained transcriptomics data from primary UM samples and healthy tissue and prioritized protein-coding cancer genes that (a) are sufficiently expressed in the majority of the inspected UM samples, (b) lack histological evidence of protein expression in normal tissues recorded in the Human Protein Atlas (HPA), and (c) display high-in-tumor, low-in-tissue expression in the transcriptomics data available in Genotype-Tissue Expression (GTEx) (RNA-seq expression in 90% of normal tissue samples lower than in 90% of UM samples).
Candidate peptide k-mer extraction and post-hoc screening. We utilized a FASTA file of the human proteome to retrieve all annotated protein sequences for the prioritized genes, and enumerated k-mer peptides of length 9–12 amino acids for each protein sequence. To further avoid cross-reactivity, we screened each candidate peptide against the complementary part of the human proteome (ie, all protein sequences from non-prioritized genes) and excluded peptides with literal sequence matches.
Efficacy score (ES)-based candidate peptide ranking. To select the most promising candidate peptides for experimental validation, we developed a multivariate score function (named ES) that aggregates information on gene expression, cancer gene network connectivity, peptide binding affinity and immunogenicity to rank the peptides:
The ES function is a probability chain composed of five normalized subfunctions (a–e) modeling the (a) expression of the prioritized gene in the tumor utilizing constrained tumor median expression (consTME), (b) tumor indispensability of the prioritized gene based on the prominence of its position in a network of known cancer genes (Idspx) (figure 1B and online supplemental table S1), (c) HLA allele-specific affinity of the candidate peptide computed utilizing a constrained binding affinity based on NetMHCpan predictions (consIC50) (online supplemental table S2), (d) MHC binding probability as predicted by an ML model (gBP), and (e) candidate peptide immunogenicity inferred via an ML model (gAP). gBP and gAP are in-house random forest (RF) models that use the candidate peptide’s physicochemical properties as input (hydrophobicity, isoelectric point, molecular weight, stability index, polarity and sequence length). To generate the training set for both models, we selected reliable peptides from the MHCBN database V.4.0, and additional binder and non-binder peptides identified through crystallography experiments. To train the models, we performed 100 iterations of weighted subsampling from the training data, and in each iteration trained an RF model with 10 000 trees. For both models, responses were discretized at the threshold of 0.5 and the respective averages of the 100 RF models’ discretized classifications were used as the probability output of gBP and gAP. Model performance for gBP and gAP was evaluated and compared with other models on independently curated validation sets from the IEDB10 (figure 1C, table 1 and online supplemental table S3). We feed the results for each candidate peptide into the function and rank the candidate peptides according to the obtained ES value.
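The way the gBP and gAP outputs are formed (discretize each ensemble member's response at 0.5, then average the votes) can be illustrated with a minimal Python sketch. It is not the pipeline's code: the function name `ensemble_probability`, the use of `>=` at the threshold, and the stand-in "models" that return fixed responses instead of trained random forests are all assumptions made for illustration.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Model = Callable[[Features], float]  # hypothetical: returns a raw response in [0, 1]


def ensemble_probability(models: Sequence[Model], features: Features,
                         threshold: float = 0.5) -> float:
    """Return the fraction of ensemble members whose discretized response is positive."""
    if not models:
        raise ValueError("need at least one model")
    # Discretize each member's raw response, then average the 0/1 votes.
    votes = [1 if model(features) >= threshold else 0 for model in models]
    return sum(votes) / len(votes)


# Stand-in ensemble: four fixed raw responses instead of 100 trained forests.
toy_models = [lambda x, r=r: r for r in (0.9, 0.7, 0.4, 0.55)]
print(ensemble_probability(toy_models, [0.0]))  # 3 of 4 votes are positive -> 0.75
```

With 100 members, the output can only take values in steps of 0.01, which is why the averaged vote is usable as a coarse probability.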
A detailed description of each individual step of the bioinformatics workflow is given in online supplemental methods.
For validation, we selected the top 20 ES-ranked candidate peptides (high efficacy or HE) for HLA-A*02:01 together with 20 randomly selected candidate peptides scored ES=0, here considered negative controls (low efficacy or LE). Furthermore, to compare our results with a gold-standard approach, we selected 20 candidate peptides with an ES of zero but a standard (NetMHCpan)-predicted binding affinity that matches the HE peptides (alternative predictor or AP). These 60 peptides were synthesized at laboratory quality and 90% purity, and for a subset, their binding affinities to HLA-A*02:01 were estimated experimentally. Further, we split each peptide tier (HE, LE, and AP) into four pools of five peptides and performed in vitro experiments with HLA-A*02:01-positive healthy-donor peripheral blood mononuclear cells (PBMCs), in which we measured IFN-γ secretion by flow cytometry and ELISA. We also performed cytotoxicity assays by coculturing the HLA-A*02:01-positive UM cell line 92.1 with the peptide-stimulated PBMCs.
In-house UM sample preparation and RNA sequencing
In accordance with current regulatory and ethics standards within the context of clinical trial NCT01983748,11 patients with UM gave informed consent before tumors were surgically removed. Tumor samples were preserved in RNAlater (ThermoFisher Scientific, Waltham, Massachusetts, USA) and RNA was extracted with RNeasy Mini kits (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. Transcriptome sequencing was performed by a commercial service provider (CeGat, Tübingen, Germany). For analysis, raw FASTQ files were quality controlled and aligned against the human reference genome version hg38 using STAR.12 Quantification was performed using StringTie13 against the Gencode comprehensive annotation version 28. For further details on the sample processing, see online supplemental figure S1.
In vitro testing of candidate peptides with PBMCs from healthy donors or patients with UM
The selected peptides were synthesized at laboratory quality and 90% purity by GenScript (Leiden, The Netherlands). Leukapheresis products (four healthy donors) or fresh blood (two healthy donors, two patients with UM) were obtained from donors selected for their positive cytomegalovirus (CMV) and HLA-A*02:01 status while adhering to current regulatory and ethics standards including obtaining informed consent. PBMCs were purified by Ficoll gradient centrifugation (800 g, 20 min, 20°C, brake off). The four healthy-donor leukapheresis-derived PBMCs were subsequently cryopreserved in liquid nitrogen at a concentration of 100 million/mL in freezing medium containing 10% DMSO. After thawing, 1–2 million cells/mL were recovered for 18–24 hours in serum-free TexMACS GMP medium (Miltenyi Biotec, Bergisch-Gladbach, Germany) at 37°C. Before peptide stimulation, cells were harvested by centrifugation and counted. Batches of 20 million live PBMCs were stimulated per peptide pool or CMV positive control (human PepTivator CMV pp65, Miltenyi Biotec, Bergisch-Gladbach, Germany) at a total peptide concentration of 1 µg/mL, or left unstimulated. Peptide loading was performed in 20 mL of prewarmed serum-free medium for 2 hours at 37°C. Afterwards, cells were spun down and washed with medium to remove unbound peptides. Cells were then incubated at an initial concentration of 2 million/mL for 9 days at 37°C in Roswell Park Memorial Institute (RPMI) 1640 medium (Gibco by Life Technologies GmbH, Darmstadt, Germany) supplemented with 1% (v/v) GlutaMAX (Gibco by Life Technologies GmbH, Darmstadt, Germany), 50 IU/mL IL-2 (Aldesleukin, Novartis Pharma GmbH, Nürnberg, Germany) and 1% (v/v) human AB serum (Anprotec, Bruckberg, Germany). On day 5 of incubation, the culture volume was increased with fresh, supplemented RPMI 1640 medium to a total of 2.5 times the volume on day 0.
On day 9 of stimulation, culture supernatant was probed for IFN-γ (ELISA MAX Deluxe Set, Biolegend, San Diego, USA) and stimulated PBMCs were investigated with an IFN-γ Secretion Assay (Miltenyi Biotec, Bergisch-Gladbach, Germany) and by Incucyte Live Cell Imaging (Sartorius, Göttingen, Germany) according to the manufacturers’ instructions.
The HLA-A*02:01-positive14 UM cell line 92.1,15 kindly provided by Klaus Griewank, University Hospital Essen, was selected as a cytotoxicity target and cultivated in UM medium containing RPMI 1640 (Gibco by Life Technologies GmbH, Darmstadt, Germany), 2 mM L-glutamine (Gibco by Life Technologies GmbH, Darmstadt, Germany), 10% fetal bovine serum (Merck, Darmstadt, Germany), and 1× Antibiotics-Antimycotics (Gibco by Life Technologies GmbH, Darmstadt, Germany) at 37°C with 5% CO₂. The 92.1 cells were stained with 0.75 µM Cytolight Green (Sartorius, Göttingen, Germany) in PBS for 20 min at 37°C before the Cytotox Assay. After two washing cycles, stained 92.1 cells were seeded in a 96-well plate and incubated for 30 min at 37°C to allow for reattachment. Stimulated PBMCs were then added in an effector:target ratio of 4:1 (final volume 200 µL) and the culture medium was supplemented with Annexin V Red Dye (Sartorius, Göttingen, Germany) to facilitate ongoing staining of apoptotic cells. Green and red fluorescence channels were recorded once every 60 min over a period of 45 hours. The aggregated area of red cells (µm²/image) as a measure of cell death was automatically quantified by Incucyte Base Software (Sartorius, Göttingen, Germany) and background-corrected against stained 92.1 cells cultured in the absence of PBMCs.
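The background correction described above can be pictured as a per-time-point subtraction. The text does not state the exact operation, so the subtraction, the function name `background_corrected` and the toy numbers below are assumptions in this minimal Python sketch.

```python
def background_corrected(coculture_area, target_only_area):
    """Subtract the PBMC-free control signal from the coculture signal per time point.

    Both inputs are red-area readouts (µm²/image) sampled at the same time points.
    """
    if len(coculture_area) != len(target_only_area):
        raise ValueError("time series must have the same length")
    return [c - b for c, b in zip(coculture_area, target_only_area)]


# Hypothetical readouts at three consecutive time points.
coculture = [120.0, 300.0, 900.0]    # 92.1 cells plus stimulated PBMCs
target_only = [100.0, 150.0, 200.0]  # 92.1 cells cultured without PBMCs
print(background_corrected(coculture, target_only))  # [20.0, 150.0, 700.0]
```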
In vitro HLA binding validation of selected candidate peptides
Binding affinities of predicted epitopes were analyzed by a UV-mediated peptide exchange assay using in-house produced peptide*HLA-A*02:0116 (Sanquin, Amsterdam, The Netherlands), of which the heavy chain is biotinylated, as described previously.17 Briefly, peptide exchange was performed in duplicate by combining 0.53 µM conditional p*HLA complex in the presence or absence of 50 µM of the candidate peptide. The mixture was exposed for 30 min to 366 nm UV light and subsequently incubated for 30 min at 37°C. Peptide exchange efficiency was analyzed using a beta-2 microglobulin (β2M)-specific ELISA, which only detects peptide-stabilized HLA class I complexes, indicative of peptide binding as described previously.17 Briefly, exchange reactions were incubated for 1 hour at 37°C on 2 µg/mL streptavidin-coated Nunc MaxiSorp plates. Non-bound material was removed by washing. Plate binding was assessed with 0.3 µg/mL horseradish peroxidase-conjugated anti-human β2M antibody and detected with 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) diammonium salt substrate solution. The recorded absorbances were normalized to the absorbance of a known HLA-A*02:01 peptide ligand with a high affinity (NLVPMVATV, representing 100% relative affinity). Controls included non-exchanged conditional p*HLA complex, an HLA allele-specific non-binder (IVTDFSVIK) and UV irradiation of the conditional p*HLA complex in the absence of a rescue peptide.17
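The normalization step can be sketched in a few lines of Python. The sketch assumes that duplicate absorbances are averaged before dividing by the reference peptide's mean absorbance and that no blank is subtracted; neither detail is stated in the text, and the function name and numbers are illustrative only.

```python
from statistics import mean


def relative_affinity(sample_abs, reference_abs):
    """Express a peptide's mean absorbance as a percentage of the reference peptide's."""
    reference = mean(reference_abs)
    if reference <= 0:
        raise ValueError("reference absorbance must be positive")
    return 100.0 * mean(sample_abs) / reference


# Hypothetical duplicate absorbance readings.
candidate = [0.5, 0.75]   # candidate peptide
reference = [1.25, 1.25]  # high-affinity reference ligand (defines 100%)
print(relative_affinity(candidate, reference))  # 50.0
```

Because the reference defines 100%, a candidate that stabilizes more complex than the reference yields a value above 100%.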
Statistics
Moderated pairwise t-tests of the readouts between experimental conditions were performed with limma (Ritchie et al) in R, applying multiple testing correction and using donor identity as a covariate. An alpha threshold of 0.05 was applied.
Results
In this work, we constructed and deployed a pipeline to rank the predicted therapeutic efficacy of potential antigenic epitopes for a specific tumor of interest (see figure 1A and the Methods section for a detailed description). In the following, we illustrate the use of the pipeline, utilizing as case study the selection of efficacious and tolerable peptides for UM therapy.
Antigen and peptide selection to minimize the risk of irAE
We retrieved from public repositories the empirical mRNA abundances of 80 primary UM biopsies18 and compared them with the GTEx data set of healthy tissues to select protein-coding genes with a high-in-tumor, low-in-tissue expression profile.19 This first step is intended to reduce the risk of severe irAEs in a later immunotherapy setting (figure 2A). We found 9556 protein-coding genes with sufficient baseline expression in UM, of which 1722 turned out favorable due to lack of histological evidence of protein expression in healthy tissue according to the HPA (figure 2B). After filtering against GTEx, we were left with 22 protein-coding genes. Next, we confirmed that these prioritized genes are stably expressed in an independent cohort of 14 primary UM samples obtained at the Department of Ophthalmology of the Uniklinikum Erlangen (figure 2C). Interestingly, melanocyte-derived antigens like MLANA, TYR and PMEL are stably expressed across our selected UM samples.20 Moreover, TMEM200C has been recently identified as a potential marker for progression in UM.21
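One way to read the GTEx criterion from the Methods (expression in 90% of normal tissue samples lower than in 90% of UM samples) is as a comparison of two quantiles. The following Python sketch takes that reading as an assumption, uses a nearest-rank quantile for simplicity, and operates on made-up expression values; it is not the pipeline's implementation.

```python
import math


def nearest_rank_quantile(values, q):
    """Smallest observed value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def high_in_tumor_low_in_tissue(tumor, normal, q=0.9):
    """True if the upper bulk of normal samples lies below the lower bulk of tumor samples."""
    normal_hi = nearest_rank_quantile(normal, q)    # at least 90% of normal samples <= this
    tumor_lo = nearest_rank_quantile(tumor, 1 - q)  # at least 90% of tumor samples >= this
    return normal_hi < tumor_lo


# Hypothetical expression values for one gene (arbitrary units).
tumor = [12, 15, 9, 20, 18, 14, 16, 11, 13, 17]
restricted_normal = [0.1, 0.5, 0.2, 0.0, 1.0, 0.3, 0.4, 0.2, 0.1, 8.0]
broad_normal = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(high_in_tumor_low_in_tissue(tumor, restricted_normal))  # True
print(high_in_tumor_low_in_tissue(tumor, broad_normal))       # False
```

Note that a single high normal sample (8.0 in the toy data) does not veto the gene under this reading, because the rule tolerates up to 10% of normal samples above the cut.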
To further avoid peptide cross-reactivity, we first extracted all overlapping k-mer peptides of length 9–12 for all proteins expressed from the 22 prioritized genes. Next, we screened these peptides against the complementary proteome, that is, all human protein sequences arising from non-prioritized genes, and discarded peptides with literal sequence matches. As seen in figure 2D, when applying this additional filter, we found that the number of selected tumor peptides depends on the extent of the overlap between each candidate tumor protein and the rest of the human proteome. For example, TRPM1 belongs to a family of highly conserved proteins,22 and therefore the vast majority of its peptides were discarded in the above filter. Also, more than 90% of peptides from the OCA2 gene, which encodes a 12-transmembrane domain protein with homology to a superfamily of permeases,23 are filtered out. Overall, 9 of the 22 prioritized genes featured peptides in their expressed protein isoforms that were discarded (ABCB5, ELFN1, CABLES1, TMEM200C, SLC45A2, ALX1, RAB38, OCA2 and TRPM1). In total, we removed 11 343 of the original 51 374 unique peptides, leaving 40 031 for subsequent analysis.
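The two steps, overlapping k-mer enumeration and literal-match screening, can be sketched in Python as follows. The function names and the toy sequences are hypothetical, and the naive substring scan is chosen for readability; at proteome scale, an index of background k-mers would be the more practical choice.

```python
def enumerate_kmers(sequence, k_min=9, k_max=12):
    """Return the set of all overlapping peptides of length k_min..k_max."""
    return {sequence[i:i + k]
            for k in range(k_min, k_max + 1)
            for i in range(len(sequence) - k + 1)}


def screen_against_background(candidates, background_sequences):
    """Keep only candidates that do not occur literally in any background protein."""
    return {peptide for peptide in candidates
            if not any(peptide in protein for protein in background_sequences)}


# Toy sequences, not real proteins.
target_protein = "ACDEFGHIKLMNPQ"    # 14 residues
background = ["WWACDEFGHIKWW"]       # shares one 9-mer with the target
candidates = enumerate_kmers(target_protein)
kept = screen_against_background(candidates, background)
print(len(candidates), len(kept))    # 18 17
```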
Ranking of antigens based on their importance in a cancer-targeted interaction network
We hypothesized that selecting peptides from TAAs with a role in key cancer processes may inhibit the tumor’s ability to evade therapy by selectively suppressing gene expression.5 To substantiate this hypothesis, we generated simulations from an agent-based computational model24 reflecting the interplay between melanoma and immune cells in the tumor microenvironment (figure 3A and online supplemental mat). The simulations indicate that antigens linked to central cancer cell functions (eg, cell cycle) are less prone to expression suppression, and targeting their derived epitopes leads to a more effective depletion of cancer cells. In contrast, targeting epitopes from bystander proteins not associated with cancer pathways led to inefficient tumor control and resistance to the therapeutic CTLs. This suggests that peptides arising from proteins linked to pathways and cellular functions central to cancer progression are favorable targets.
To incorporate the above idea, we generated a score describing a gene’s cancer importance (GI) by counting the associations between the gene and cancer pathways taken from a manually curated list of cancer-relevant GO terms and other databases of cancer genes (figure 1B, see online supplemental mat). Some well-studied cancer genes score very high in GI, with TP53 as the top-scoring gene (GI=235) and TNF in the second rank (GI=122). Our candidate genes had a maximum GI value of 14 for TMEM200C, while C14orf169 and PNMA6A had a GI value of zero, indicating no established direct association with cancer pathways. The melanocyte antigen MLANA had a GI value of two, the same as the melanoma-associated gene TYRP1.
As proteins execute their functions while embedded in biochemical networks, we pooled data from publicly available databases to reconstruct an interaction network around our 22 selected UM genes, all genes annotated in DriverDBv325 and their direct interactors. Next, we derived a mathematical function to define the importance of a gene in the cancer network based on the gene’s GI and that of its direct interactors (gene indispensability index, Idspx, see online supplemental mat). The Idspx attempts to rank the candidate genes in terms of how hard it is for the tumor to evade immunotherapy against an antigen. We computed this metric for our 22 candidate genes and all the other genes belonging to the network (figure 3C). We found that well-known cancer genes were top-ranked genes according to this metric, including YBX1 (rank 1), FOXP3 (rank 2), MYC (rank 9), and TP53 (rank 13). Our prioritized genes were widely distributed across the interval of the Idspx values, with CABLES1 ranked highest (rank 679, Idspx=0.8837) and C11orf71 ranked lowest (rank 12 928, Idspx=0.375, figure 3D).
Peptide ranking according to a data-driven, network-driven and model-driven computational score
The aim of our analysis was to rank peptides based on the expression and network indispensability of the UM prioritized genes, as well as on the peptides’ MHC binding affinity and their capability to elicit an immune response. To this end, we derived a score function (ES) that was formalized in mathematical terms as a probability chain composed of five normalized subfunctions (a–e) accounting for the tumor expression of the prioritized gene (a) and its importance in a cancer network as quantified by Idspx (b), the allele-specific binding affinity of its derived peptides (c), and the MHC binding probability (d) and immunogenicity (e) of these peptides as predicted by ML models (see online supplemental mat).
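The exact functional form of the ES is given in the supplement and is not reproduced here. As a purely illustrative reading of "probability chain", the Python sketch below assumes that the five normalized subscores are multiplied and scaled to the 0–100 range; the product form, the function name and the argument names are assumptions.

```python
from math import prod


def efficacy_score(cons_tme, idspx, cons_ic50, g_bp, g_ap):
    """Illustrative ES: product of five subscores in [0, 1], scaled to 0-100."""
    parts = (cons_tme, idspx, cons_ic50, g_bp, g_ap)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each subscore must lie in [0, 1]")
    return 100.0 * prod(parts)


print(efficacy_score(1.0, 0.5, 1.0, 0.5, 1.0))  # 25.0
print(efficacy_score(1.0, 0.0, 1.0, 1.0, 1.0))  # 0.0
```

Under a multiplicative form, any single subscore of zero forces the ES to zero regardless of the other terms, which makes each subfunction act as a veto.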
For all the peptides selected based on the high-in-tumor, low-in-tissue expression of their associated tumor antigens and the minimization of the probability to elicit irAE, we retrieved the data necessary for calculating the five subfunctions and estimated their ES. We calculated the ES for all pairwise combinations of the 40 031 peptides identified above and 36 frequently studied HLA alleles (approx. 1.4 million epitopes). Formally, the ES ranges from 0 (the worst score) to 100 (the best score), but in our case study of UM only 47 408 HLA-epitope combinations yielded a non-zero score (3.3%). Two genes, FNDC10 and SMIM10L1, were assigned scores of 0 for all their peptides because the genes’ Idspx was 0. Only 1534 epitopes derived from 17 genes had an ES of at least 1 (figure 4A). In this subset, the median ES was 2.84 and the maximum 42.27, which corresponds to a peptide derived from the gene TSPAN10 (figure 4B). We further investigated the distribution of ES values higher than zero for the remaining 17 candidate genes and found that most gene candidates offered a broad range of potential epitopes to select from. MLANA, a known immunogenic antigen in metastatic cutaneous melanoma (CM), had a relatively low score range (1–15.81) compared with other prominent CM antigens like PMEL (1–31.05) or TYR (1–36.55) (figure 4B). ALX1 had only one peptide with a non-zero ES (ES=3.52).
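The size of the search space and the reported fraction of non-zero scores follow directly from the counts above, as this short Python check shows (variable names are illustrative).

```python
n_peptides = 40_031  # peptides remaining after the sequence-identity filter
n_alleles = 36       # HLA alleles considered
n_nonzero = 47_408   # HLA-epitope combinations with a non-zero ES

n_pairs = n_peptides * n_alleles
share_nonzero = round(100 * n_nonzero / n_pairs, 1)
print(n_pairs)        # 1441116, i.e. approx. 1.4 million
print(share_nonzero)  # 3.3
```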
We further extracted the top 1% of epitopes for a closer investigation, yielding 16 epitopes generated from 6 genes with an ES higher than 28.81 (figure 4C). TSPAN10 generated five of these epitopes for three different alleles; seeing that one was specific for HLA-A*02:01, broad applicability may be feasible for this antigen. TYRP1 produced three high-ranking epitopes; this gene has been investigated in some clinical trials using monoclonal antibodies with limited success in relapsed CM patients.26 27 TYR, another TAA associated with CM, produced two highly ranked epitopes for two different alleles. CABLES1 presented with only one highly ranked candidate and is generally not associated with UM or CM apart from containing a potential driver mutation site.28 We obtained three top-ranked peptides from PMEL, another pigmentation gene and CM antigen, which is considered a highly ranked candidate biomarker in UM.29 Finally, the gene OCA2 produced two top-ranked peptides for the allele A*02:01. This gene is an attractive target for direct epitope intervention and other therapeutic options since it is a transmembrane protein playing a role in pigmentation and melanin synthesis, which is used as a prognostic and predictive marker for CM and primary UM.
Selection of peptides for experimental validation and assessment of peptide-HLA binding in silico and in vitro
To validate our strategy, we selected the top 20 peptides as ranked by ES (HE tier) for the prevalent allele HLA-A*02:01 together with 20 randomly selected peptides with an ES of zero across all alleles as negative controls (LE tier). Furthermore, to compare our results with a gold-standard approach, we selected 20 candidate peptides with an ES of zero but a NetMHCpan-predicted binding affinity distribution that matched the HE tier’s (AP tier). We had the 60 selected peptides chemically synthesized by a commercial provider, with three ultimately failing synthesis (table 2).
Figure 5 shows a heatmap summarizing the features of the peptides belonging to the HE, LE and AP tiers. Together with their amino acid sequence, we visualized the values for the physicochemical features utilized as input for the ML models, their assessment in the five subfunctions and their ES values. The HE peptides cluster together with higher average values for hydrophobicity and lower values for isoelectric point, molecular weight and polarity. The tumor median expression is similar in all peptides, independent of their tier, with the exception of four peptides derived from PMEL, whose expression is above average. The HE peptides perform in general better in the subfunction accounting for immunogenicity (gAP). On average, HE peptides also feature a higher Idspx. The value of this subfunction is more variable for the other peptide groups. Interestingly, three peptides from the LE and AP tiers cluster together with the HE peptides and display comparable values for all the metrics except the Idspx. This suggests that peptides that would be good candidates according to most of the subfunctions were discarded due to the poor cancer-network connectivity of their antigen.
To compare the tier predictions with state-of-the-art molecular-level computational analysis, docking and molecular dynamics simulations were carried out in a blinded fashion, that is, without knowledge by the operator of tier assignment, for each of the 60 candidates by pairing HLA-A*02:01 with the respective peptide (see online supplemental mat). An analysis of variance of the extracted free energy values showed that they were significantly different between the tiers, with the HE tier characterized by stronger binding (online supplemental figure S3A), indicating that our method tends to select peptides that can form stable complexes with MHC.
To further compare the peptides’ suitability, we experimentally quantified their binding affinities utilizing a UV-mediated peptide exchange assay for a sample of 33 peptides from the three tiers (online supplemental figure S3 Right and online supplemental table S4). The tested peptides from the LE tier display a relative affinity (ie, compared with a high-affinity peptide) significantly smaller than that of the HE and AP tiers. The average relative affinity of the AP tier is higher than that of the HE tier, but the distribution of relative affinities of HE peptides is more compact, with mean and median values almost identical and all relative affinities between 38% and 106%. In contrast, the AP group contains peptides with extremely high (155%) and extremely low (5%) outliers of relative affinity. The LE group contains two peptides with rather high relative affinities (76% and 56%). In our view, this reflects the features of the procedure we followed to train our ML model for binding probability, in which we favored a very low false positive rate to ensure that peptides of high rank were true binders.
Functional validation of peptides in in vitro coculture assays with primed T-cells and UM cells
To test the translatability of our predictions, we used a GMP-compliant procedure to obtain antigen-specific T cells through peptide stimulation of leukapheresis products.30 To curtail experimental demand, we prepared peptide mixtures such that each tier of 20 peptides was split into four pools of up to five peptides, yielding a total of 12 peptide pools. In the HE tier, peptides were assigned to the pools in descending order of ES. To blind experimental procedures, pools were labeled randomly and provided to the experimental team.
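The pooling scheme for the HE tier can be sketched in Python as a sort followed by chunking. The sketch assumes that the highest-scoring peptides fill the first pool; the peptide names and scores are placeholders, not the study's peptides, and the random relabeling used for blinding is left out.

```python
def assign_pools(scored_peptides, pool_size=5, prefix="HE"):
    """Sort (peptide, score) pairs by descending score and cut them into pools."""
    ranked = sorted(scored_peptides, key=lambda item: item[1], reverse=True)
    return {f"{prefix}{start // pool_size + 1}":
            [peptide for peptide, _ in ranked[start:start + pool_size]]
            for start in range(0, len(ranked), pool_size)}


# Placeholder tier of 20 peptides with distinct, descending scores.
scored = [(f"pep{i:02d}", 21 - i) for i in range(1, 21)]
pools = assign_pools(scored)
print(list(pools))   # ['HE1', 'HE2', 'HE3', 'HE4']
print(pools["HE1"])  # ['pep01', 'pep02', 'pep03', 'pep04', 'pep05']
```

If peptides fail synthesis, the same function simply yields a shorter last pool, which matches the "up to five peptides" wording.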
PBMCs from four HLA-A*02:01-positive and CMV-seropositive healthy blood donors were stimulated with one of the pools, CMV-pp65 peptides as positive control, or a negative control without peptide as shown in figure 6A.
Since IFN-γ production is a surrogate marker for antigen-specific T-cell activation, we measured the frequency of IFN-γ-secreting T cells within the PBMC culture by flow cytometry after 9 days of stimulation (representative plots in figure 6B). Despite considerable interdonor variation in the response, we observed a significant increase in IFN-γ-positive, antigen-specific T cells in pools HE1, HE4, AP3 and LE1 (figure 6C).
A hallmark of a protective T-cell response against tumors is direct tumoricidal activity.31–33 We therefore incubated peptide pool-expanded T cells with the HLA-A*02:01-expressing UM cell line 92.1 (online supplemental figure S5) and recorded killing events with live microscopy (figure 6B; for time-lapse recordings, see online supplemental material). CMV-pp65 peptide-expanded T cells were used as a positive control30 due to their known cross-reactivity with 92.1-expressed tyrosinase34 35 and CMV’s tissue tropism, which includes the choroidea.36 Consistent with their ability to secrete IFN-γ, expanded T cells from the HE4 pool showed pronounced cytotoxic activity against UM targets (figure 6D). In contrast, the same T cells did not exert cytotoxic activity against autologous macrophages (online supplemental figure S6), which suggests that HE tier-expanded T cells do not elicit measurable off-target effects. In summary, these data show that our ES prioritizes peptides that can mount an intended, specific T-cell response under in vitro conditions with little to no off-target effects on a non-tumor cell population.
As a step beyond healthy controls, we examined whether PBMCs isolated from patients with UM would react to our peptides. For this, we compared the IFN-γ concentrations in the culture supernatants of pool-stimulated PBMCs from healthy donors or patients with UM by ELISA (figure 6E). We observed a prominent reaction in PBMCs from patients with UM that surpassed the one in PBMCs from healthy donors, in particular in the HE pools. Taken together, we show consistent evidence of antigen-specific T-cell activation in several experiments, with HE-tier pools more likely to be efficacious.
Figure 6F summarizes the experimental results obtained for all peptide pools. The heat map cells indicate how many peptides each prioritized gene (in rows) contributed to each pool (in columns), and the pools are annotated with their experimentally observed immunogenic activity (upper, purple-colored and red-colored bars). We also retrieved proteomics data for the UM 92.1 cells37 to annotate the prioritized genes with their experimentally observed protein expression (green-colored left-hand panel). A viable peptide pool for clinical translation should contain peptides coming from proteins demonstrably expressed by 92.1, which display high binding affinity and activate T cells in the functional experiments. Among the 12 peptide pools, only HE1 and HE4 fulfill all the requirements: they contain peptides coming from proteins expressed by 92.1 cells (TSPAN10, TYR and CABLES1), yielded high peptide binding affinity and showed cytokine secretion and cytotoxic activity against 92.1 cells.
Discussion
Aiming to discover tumor-type specific immunotherapy options, we here provide and validate an antigen selection algorithm for targeted therapies like autologous cell transfer or therapeutic antitumor vaccination. Our approach for MHC-I-restricted antigens explicitly optimizes for self-tolerance and anti-tumor immunogenicity at the selection level. Despite emerging roles for MHC II-restricted antigens in tumor control,38 we focused on the better-established CD8 cytotoxicity usually mediated via MHC I. While there are many MHC-I binding prediction algorithms that focus on the interaction of peptide and HLA allele, we show that a tumor entity-specific approach incorporating the relevant transcriptomic landscapes can yield promising results. We developed the methodology to select tolerable therapeutic peptides for metastatic UM, a cancer type with bleak prognosis, but our ES and validation pipeline can be applied to other cancer entities for which transcriptomic information is available. While our algorithm prioritized some well-known melanosomal TAAs like PMEL (gp100) and MLANA (Melan-A), we also found a TAA linked to paraneoplastic syndrome in neurological disease, the gene PNMA6A, with no established connection to UM.39–41 Additionally, we selected TRAPPC9, a gene whose expression has been linked to increased tumorigenesis and which is a potential prognostic marker in other cancers.42 43 This shows that our algorithm does not only prioritize genes that are UM-specific but can also single out candidates with a potential functional impact in cancer progression.
In experimental comparisons of our high-efficacy candidates for HLA-A*02:01 against alternative-predictor and low-efficacy controls, two of the high-efficacy (HE) pools induced an observable T-cell response. While T-cell stimulation with pool HE4 led to high IFN-γ production measured by both ELISA and flow cytometry as well as cytotoxic efficacy against a UM cell line, we observed considerable variation in response intensities across PBMC donors. We believe that one possible explanation for these discrepancies lies in the individual-specific availability of mildly autoreactive T-cell clones that are cognate for the selected antigens. These cells originate from stochastic failures in central tolerance, where some self-reactive T cells escape negative selection during T-cell maturation in the thymus,44 and their peripheral prevalence is one of the factors determining the strength of the effector response against tumor-associated autoantigens. We want to point out that despite computational efforts to restrict each experimentally tested peptide’s binding partner to only one MHC-I allele, we cannot rule out binding promiscuity to other MHC-I alleles and its contribution to our experimental results.
Interestingly, pool LE1 demonstrated the overall strongest IFN-γ response despite consisting of peptides with a low predicted efficacy. This unexpected signal could stem from the way we intentionally biased the performance of the ensemble models predicting binding and activity: by strongly favoring positive predictive value, and in return accepting a high false-negative rate, we potentially wrongfully sort some good peptide candidates into the LE tier. Also, pool LE1 contains peptides coming from genes with low or zero indispensability index but displaying high values for the other ES subfunctions. For example, the LLLAACAPPPC peptide belongs to LE1 but clusters in figure 5 together with the HE-tier peptides, distinguished only by an indispensability index of zero.
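The trade-off described here can be made concrete with the standard definitions of positive predictive value, TP / (TP + FP), and false-negative rate, FN / (TP + FN). The sketch below uses hypothetical confusion-matrix counts for a strict and a lenient decision threshold; the counts are invented for illustration and do not describe the performance of the ensemble models in this study.

```python
def ppv(tp, fp):
    """Positive predictive value: share of predicted positives that are true."""
    return tp / (tp + fp)


def false_negative_rate(tp, fn):
    """Share of true positives that the model fails to call."""
    return fn / (tp + fn)


# Hypothetical counts (NOT study data) for the same 50 true binders
thresholds = {
    "strict": {"tp": 20, "fp": 1, "fn": 30},
    "lenient": {"tp": 40, "fp": 20, "fn": 10},
}

for name, c in thresholds.items():
    print(
        f"{name}: PPV = {ppv(c['tp'], c['fp']):.2f}, "
        f"FNR = {false_negative_rate(c['tp'], c['fn']):.2f}"
    )
```

In this toy example, the strict threshold yields few false positives among the top-ranked peptides but misses more than half of the true binders, which is the kind of behavior that could place a good candidate in the LE tier.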
Compared with other antigen and peptide selection algorithms, ours integrates two hypothesis-based considerations to avoid potentially life-threatening autoimmune reactions in non-tumor tissues: discarding genes with high expression in critical tissues and discarding peptides showing high sequence homology with non-selected antigens. We supplement these theoretical constraints with experimental validation not only of the selected antigen’s in vitro efficacy but also of its in vitro tolerability by cells of the immune compartment. In spite of advances in contemporary antigen selection algorithms, experimental validation is still an essential step.45 46 Further validation of our predictions would proceed with long-term and more involved experiments, where one would, for example, test if the gene indispensability accurately captures the uninterrupted expression of the prioritized genes under immune pressure, or if samples of survival-critical tissues are spared from the mounted immune reaction. One can ultimately also test different delivery methods, such as off-the-shelf constructs (eg, mRNA vectors) or autologous cell therapy (eg, adoptive T-cell transfer), in further in vitro studies. Finally, scientific advances in the establishment of suitable animal models for UM would allow for in vivo tests.
The methodology presented in this study lends itself to the design of targeted anti-UM immunotherapies in both the non-personalized and the personalized setting. With contemporary turn-around times in RNA-sequencing and GMP-grade peptide synthesis, an individual patient’s tumor biopsy transcriptome can inform therapeutic decisions with only marginal delay. Also, large cohort studies of the UM transcriptome will potentially allow the identification of antigens and peptides that are simultaneously efficacious and tolerable in a high percentage of patients with UM. The recent approval of tebentafusp (tradename Kimmtrak),47 48 an MHC-restricted T-cell redirection agent cognate for a peptide from melanoma-associated gp100, reinvigorates the validity of TAAs. Predicting these TAAs correctly will make it possible to explore a wide range of therapeutic options. Among others, premanufactured libraries of peptides would bring the vision of off-the-shelf antigen therapies closer to reality (figure 1A).
Data availability statement
Data are available in a public, open access repository. Data are available upon reasonable request. The public transcriptomics data used to prioritize genes are available on the GDC Data Portal under the Project ID TCGA-UVM (https://portal.gdc.cancer.gov/projects/TCGA-UVM). All data (transcriptomics or otherwise) generated by this study are available upon reasonable request to the corresponding author.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by Medical Faculty Ethics Commission of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). For uveal melanoma biopsy collection: Vote Processing Number (Bearbeitungsnummer) 12_2011. For PBMC collection from healthy donors: Vote Processing Number (Bearbeitungsnummer) 20-446-Br. Participants gave informed consent to participate in the study before taking part. Vote Processing Number (Bearbeitungsnummer) 22-249-Bp
Acknowledgments
We are grateful to Luca Musella for discussions on the manuscript that led to improvements. We are also indebted to the biopsied patients with UM and PBMC donors for their agreement to provide samples, and to the medical staff who collected and processed the samples.
References
Footnotes
X @JulianFreen
CL and ME contributed equally.
Correction notice This article has been corrected since it was first published online. The affiliations of author Cindy Flamann have been updated to (3) BZKF, Erlangen, Germany and (4) Department of Hematology and Oncology, Universitätsklinikum Erlangen and FAU Erlangen-Nürnberg, Erlangen, Germany.
Contributors Conceptualisation: CL, ME, JV and GS. Methodology: CL, ME, JV, HB, CF, AWeich, AWessely, JB, EG, JR, JD, MW, JM and KPS. Investigation: CL, ME, CF, AWeich, AWessely, JB, EG, JD, MVH, NDvK, JJF-vH and AWT. Visualisation: CL, ME, CF, AWeich, JR, HB and JV. Funding acquisition: JV, HB, CB, MVH, JD, OW, GS, BS-T, EATK and NS. Sample acquisition: BH, HK, AWessely, MVH, JD, BS-T and EATK. Project administration: JV, HB, CB, MVH, SG, OW, GS and BS-T. Supervision: JV, HB, MVH, BS-T and SG. Writing – original draft: CL, ME and JV. Writing – review and editing: CL, ME, JV, HB, MVH, CB, CF, AWeich, AWessely, JR, JD, BS-T, HK, BH, EATK, NS and JJF-vH. JV is the guarantor.
Funding This research was funded by the Manfred-Roth-Stiftung and the intramural Forschungsstiftung Medizin am Universitätsklinikum Erlangen in a project on epitope prediction for uveal melanoma (to BS-T and JV), the Hiege-Stiftung in a project on CAR T-cell development (to NS and JV), the Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research) in projects e:Med MelAutim (01ZX1905A to JV, 01ZX1905B to OW) and Kl-VesD (161L0244A and 16LW0338K to JV), and the European Union’s Horizon Europe Research and Innovation program under grant agreement number 101057250 (CANCERNA) (to NS, JD, JV). HB was supported by Deutsche Krebshilfe (German Cancer Aid) in grant 70114489 and Deutsche Forschungsgemeinschaft (German Research Foundation) as part of TRR 221 (B12, Projektnummer 324392634) and through research training group GRK2740 (Projektnummer 447268119). BS-T and GS acknowledge support from Deutsche Krebshilfe (German Cancer Aid) in grant 110182.
Competing interests CB reports personal fees and non-financial support from BMS, personal fees from MSD, personal fees from Novartis, personal fees from Leo Pharma, personal fees from Regeneron, personal fees from Immunocore, personal fees from Sanofi-Genzyme, personal fees from Sanofi-Aventis, personal fees from Almirall-Hermal, personal fees from Roche, personal fees from Pierre Fabre, from Merck, outside the submitted work. GS, NS and JD are named as inventors on a patent on DCs electroporated with caIKK RNA (WO/2012/055551), which is held by the Friedrich-Alexander-Universität Erlangen-Nürnberg. BS-T declares advisory board honoraria from Iovance. MVH was supported by an Else Kröner Excellence Fellowship (2021_EKES.16) and received honoraria from Immunocore, MSD, BMS, Roche, Novartis, Sun Pharma, Sanofi, Almirall, Biofrontera, Galderma. JV reports personal fees from Novartis. All other authors declare no conflicts of interest.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.