Original research
Germline homozygosity and allelic imbalance of HLA-I are common in esophagogastric adenocarcinoma and impair the repertoire of immunogenic peptides
  1. Maria Alejandra Garcia-Marquez1,2,
  2. Martin Thelen1,2,
  3. Eugen Bauer3,
  4. Lukas Maas4,
  5. Kerstin Wennhold1,2,
  6. Jonas Lehmann1,2,
  7. Diandra Keller1,2,
  8. Miloš Nikolić4,
  9. Julie George4,5,
  10. Thomas Zander6,
  11. Wolfgang Schröder2,
  12. Philipp Müller1,7,
  13. Ali M Yazbeck1,7,
  14. Christiane Bruns2,
  15. Roman Thomas4,7,8,
  16. Birgit Gathof3,
  17. Alexander Quaas7,
  18. Martin Peifer1,4,
  19. Axel M Hillmer1,7,
  20. Michael von Bergwelt-Baildon8,9,10 and
  21. Hans Anton Schlößer1,2
  1. 1Center for Molecular Medicine Cologne, University of Cologne, Cologne, Germany
  2. 2Department of General, Visceral, Cancer and Transplantation Surgery, University of Cologne, Cologne, Germany
  3. 3Institute of Transfusion Medicine, University of Cologne, Cologne, Germany
  4. 4Department of Translational Genomics, University of Cologne, Cologne, Germany
  5. 5Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Cologne, Cologne, Germany
  6. 6Department I of Internal Medicine and Center for Integrated Oncology (CIO) Aachen Bonn Cologne Duesseldorf, University Hospital Cologne, Cologne, Germany
  7. 7Institute of Pathology, University of Cologne, Cologne, Germany
  8. 8German Cancer Consortium (DKTK), Heidelberg, Germany
  9. 9Gene Centre, Ludwig Maximilians University Munich, Munchen, Germany
  10. 10Department of Medicine III, Ludwig Maximilians University Munich, Munchen, Germany
  1. Correspondence to Dr Maria Alejandra Garcia-Marquez; maria.garcia-marquez{at}


Background The individual HLA-I genotype is associated with cancer, autoimmune diseases and infections. This study elucidates the role of germline homozygosity or allelic imbalance of HLA-I loci in esophago-gastric adenocarcinoma (EGA) and determines the resulting repertoires of potentially immunogenic peptides.

Methods HLA genotypes and sequences of either (1) 10 relevant tumor-associated antigens (TAAs) or (2) patient-specific mutation-associated neoantigens (MANAs) were used to predict good-affinity binders using an in silico approach for MHC-binding. Imbalanced or lost expression of HLA-I-A/B/C alleles was analyzed by transcriptome sequencing. FluoroSpot assays and TCR sequencing were used to determine peptide-specific T-cell responses.

Results We show that germline homozygosity of HLA-I genes is significantly enriched in EGA patients (n=80) compared with an HLA-matched reference cohort (n=7605). Whereas the overall mutational burden is similar, the repertoire of potentially immunogenic peptides derived from TAAs and MANAs was lower in homozygous patients. Promiscuity of peptides binding to different HLA-I molecules was low for most TAAs and MANAs and in silico modeling of the homozygous to a heterozygous HLA genotype revealed normalized peptide repertoires. Transcriptome sequencing showed imbalanced expression of HLA-I alleles in 75% of heterozygous patients. Out of these, 33% showed complete loss of heterozygosity, whereas 66% had altered expression of only one or two HLA-I molecules. In a FluoroSpot assay, we determined that peptide-specific T-cell responses against NY-ESO-1 are derived from multiple peptides, which often exclusively bind only one HLA-I allele.

Conclusion The high frequency of germline homozygosity in EGA patients suggests reduced cancer immunosurveillance leading to an increased cancer risk. Therapeutic targeting of allelic imbalance of HLA-I molecules should be considered in EGA.

  • Antigen Presentation
  • Antigens
  • Computational Biology
  • Gastrointestinal Neoplasms
  • Tumor Escape

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See


  • Prevalence of autoimmune and infectious diseases is related to specific HLA alleles.

  • Maximal heterozygosity at HLA-I loci is associated with improved overall and progression-free survival of patients with cancer treated with immune-checkpoint inhibition.

  • In melanoma and non-small cell lung cancer (NSCLC), specific “HLA supertypes” with exceptionally good or poor prognosis have been identified.


  • The fraction of patients with germline HLA homozygosity is significantly increased in Caucasian esophagogastric adenocarcinoma patients.

  • The overall mutational burden is similar in non-microsatellite-instability patients with and without germline HLA-I homozygosity.

  • The repertoire of potentially immunogenic high and moderate-affinity binders derived from mutation-associated neoantigens as well as from shared tumor-associated antigens is significantly reduced in patients with germline HLA-I homozygosity.

  • High and moderate-affinity binders derived from tumor-associated antigens and neoantigens have low promiscuity and preferentially bind one particular HLA-I molecule.

  • In heterozygous patients, 75% showed allelic imbalance of at least one HLA-I molecule in the tumor. The resulting reduced repertoire of potentially immunogenic high and moderate-affinity binders reflects selective pressure mediated by peptides binding to the lost HLA-I allele.


  • The HLA genotype needs to be considered in design and evaluation of immunotherapeutic trials.

  • Therapies aiming to reduce epigenetic silencing of HLA-I molecules appear to be a promising addition to immune-checkpoint inhibition.


Classical human leucocyte antigen class I molecules (HLA-A, HLA-B, and HLA-C) are expressed by somatic cells and define the repertoire of peptides that can be recognized by T cells.1 HLA-I genes are codominantly expressed and a heterozygous individual can express up to six different molecules, two from each locus.1 The individual combination of HLA molecules and T-cell receptors (TCRs) defines responsiveness against cancer and infection and also predisposition to autoimmune diseases, of which some are more prevalent in individuals with certain HLA alleles.2–5 In the context of cancer, homozygosity in one or more HLA-I genes could translate into a smaller repertoire of peptides derived from tumor-associated antigens (TAAs) and mutation-associated neoantigens (MANAs) that can be presented to cytotoxic T cells. This potentially puts HLA-I-homozygous individuals at a disadvantage in fighting a nascent tumor.

Esophagogastric adenocarcinoma (EGA) is the sixth most common cause of cancer-related death worldwide with a rising incidence in Western countries. Although some progress has been made in recent years, the prognosis of EGA remains poor. Overall, the median survival of esophageal cancer is about 10 months with a 5-year survival rate of 22.0%.6 Immunotherapy is effective across a wide range of cancer types7 8 and appears to be one of the most promising additional treatment options for EGA. Clinical trials with checkpoint inhibitors (CKI) targeting PD-1 demonstrated efficacy and led to approval of Nivolumab and Pembrolizumab for the treatment of EGA.9 10 However, objective response rates are low, and most patients do not benefit from the treatment. Similarly low response rates have been described for other types of cancer, and the mechanisms underlying primary or secondary resistance to immune-checkpoint inhibition are poorly understood.11 12

One important mechanism allowing cancer cells to evade the immune system is HLA molecule downregulation or loss of expression by cancer cells.13 Previous studies have described a relation between HLA expression on tumor cells and survival, prognosis or response to CKI.14 15 However, the effect of germline HLA homozygosity and allelic imbalance on the repertoire of antigens that can be presented to T cells is poorly described. Recent studies have investigated the link between low HLA diversity, survival and outcome after immunotherapy in non-small cell lung cancer (NSCLC) and melanoma.16–19 Results from these publications are contradictory and further studies are needed to appropriately address whether HLA class I zygosity affects the survival of cancer patients.

We present the first comprehensive HLA-class-I-genotyping of Caucasian EGA patients and describe an unexpectedly high rate of homozygosity. Moreover, we elucidate the impact of HLA homozygosity and imbalanced expression of HLA-I genes on the repertoire of immunogenic peptides in EGA patients.

Material and methods

Patients and samples

Tumor samples and peripheral blood mononuclear cells (PBMCs) from 80 previously untreated EGA patients and one healthy donor, all without a history of autoimmune disease or hepatitis, were collected at the University Hospital of Cologne, Germany (online supplemental table 1). All patients underwent surgical resection of their tumor with D2 or extended lymphadenectomy. Tissue samples were placed into frozen storage tubes and preserved in liquid nitrogen until use. PBMCs from all donors were isolated as described in online supplemental methods.

HLA typing

DNA from blood and fresh-frozen tumor tissue was used for genotyping with a targeted next-generation sequencing method (Illumina) using an in-house kit accredited by the national accreditation body of the Federal Republic of Germany (DAkkS) and the commission of experts for research and innovation (EFI) as described in online supplemental methods. Bone marrow donors were matched by selecting individuals who had genotypes identical to at least one of the 80 patients of this study for all three HLA-I genes.

Prediction of TAA-derived peptides

The canonical sequences of 10 TAAs frequently expressed in EGA (online supplemental table 3) were obtained from UniProt (RRID:SCR_004426). These TAAs were selected based on expression frequencies in our own cohort and publicly available data (the cancer testis antigen database and The Cancer Genome Atlas (TCGA)-based GEPIA2 platform, RRID:SCR_018294). The TAA sequences were used in combination with results from the HLA genotyping to identify the patient-specific repertoire of HLA-I-restricted 9–10-mers using the immune epitope database (IEDB) analysis resource NetMHCpan (V.4.1) tool (RRID:SCR_006604) in January 2021. Predicted good binding peptides (high and moderate affinity) were selected based on a percentile rank (PR) score ≤3.

Whole-exome sequencing

DNA isolated from fresh frozen tumor tissue and blood of patients with histologically determined tumor content higher than 60.0% (online supplemental table 4) was subjected to next generation whole-exome sequencing (WES) at Macrogen Europe (Amsterdam, Netherlands) (online supplemental methods). The single-nucleotide variants (SNVs) and small insertions and deletions (frameshift InDels) were then annotated with Ensembl Variant Effect Predictor (VEP, V.104.3).21 The downstream amino acid (AA) change of the frameshift InDels was annotated using the VEP “Downstream” plug-in. The VEP annotated transcripts were then matched to the Ensembl database and the respective AA sequences were extracted (R library, EnsDb.Hsapiens.V.75). The eight upstream AAs were extracted from the Ensembl sequence and concatenated to the annotated changed downstream AAs.
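The construction of the frameshift query sequence can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the 0-based index convention and the use of `*` as the stop-codon symbol are all hypothetical choices made for the example.

```python
def frameshift_query_sequence(reference_protein, first_changed_index,
                              downstream_aas, n_upstream=8):
    """Join the unchanged upstream residues to the frameshifted tail.

    reference_protein   : wild-type amino-acid sequence (string)
    first_changed_index : 0-based index of the first altered residue (assumed convention)
    downstream_aas      : altered residues reported by the annotation step;
                          '*' is assumed to mark the next stop codon
    n_upstream          : number of wild-type residues kept (eight in the text)
    """
    start = max(0, first_changed_index - n_upstream)
    upstream = reference_protein[start:first_changed_index]
    # Keep only the residues before the first stop codon, if one is present.
    downstream = downstream_aas.split("*", 1)[0]
    return upstream + downstream


# Toy example with made-up sequences.
print(frameshift_query_sequence("MKTAYIAKQRQISFVK", 10, "WLPG*AA"))
```

The eight retained upstream residues ensure that every 9-mer containing at least one altered residue can still be enumerated from the query sequence.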

Prediction and identification of MANAs

SNVs and frameshift InDel mutations called from WES data were filtered by excluding mutations arising from genes usually not expressed in gastric or esophageal cancer and mucosa by using the web-based tool GEPIA2 (Gene Expression Profiling Interactive Analysis, RRID:SCR_018294). Genes with a median expression lower than three transcripts per million in tumor and normal samples were excluded. To predict neoantigens derived from SNVs, we used 15-mer peptides with the exchanged AA localized centrally. To predict neoantigens derived from frameshift InDels, the sequences including eight AA upstream followed by the insertion or deletion and until the next stop codon were used. The AA sequences in combination with the patient-specific HLA-I genotype were used to predict patient-specific repertoires of HLA-I-restricted 9–10-mer peptides using the IEDB analysis resource NetMHCpan (V.4.1) tool (RRID:SCR_006604) in January 2021 (for SNVs) and in October 2023 (for frameshift InDels).22 Predicted peptides were filtered based on their PR score ≤3 to identify high and moderate binders. Additionally, for MANA selection (derived from SNVs), the difference in predicted MHC-I binding affinity between wild-type and corresponding mutant peptides (differential agretopicity index; DAI) was included as it may reflect relevant cancer peptide immunogenicity.23 Peptides with DAI<1 were excluded. A set of unique MANAs on a per patient basis was identified based on these criteria. Potentially immunogenic mutations were defined as those producing at least one high or moderate affinity neoantigen with DAI>1.
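The SNV branch of this procedure (central 15-mer, enumeration of 9–10-mers, then filtering on PR and DAI) can be sketched as follows. This is an illustration under assumptions: the function names and the record layout are hypothetical, the percentile ranks are expected to come from an external predictor, and the DAI is assumed here to be the wild-type rank minus the mutant rank, since the exact formula is not spelled out in the text.

```python
def centered_15mer(protein, mut_index, mutant_aa, flank=7):
    """Return the window with the exchanged residue in the middle (0-based index)."""
    mutated = protein[:mut_index] + mutant_aa + protein[mut_index + 1:]
    start = max(0, mut_index - flank)
    return mutated[start:mut_index + flank + 1]


def kmers(sequence, lengths=(9, 10)):
    """Enumerate all peptides of the requested lengths."""
    return [sequence[i:i + k]
            for k in lengths
            for i in range(len(sequence) - k + 1)]


def select_manas(predictions, pr_cutoff=3.0, dai_cutoff=1.0):
    """Keep unique peptides with PR <= cutoff and DAI > cutoff.

    predictions: iterable of dicts with keys 'peptide', 'pr_mut', 'pr_wt'
    (hypothetical layout). DAI = pr_wt - pr_mut is an ASSUMPTION.
    """
    selected = set()
    for record in predictions:
        dai = record["pr_wt"] - record["pr_mut"]
        if record["pr_mut"] <= pr_cutoff and dai > dai_cutoff:
            selected.add(record["peptide"])
    return selected


# Toy protein (made up); residue 15 is exchanged for W.
window = centered_15mer("ACDEFGHIKLMNPQRSTVYACDEFGHIKLM", 15, "W")
peptides = kmers(window)
print(window, len(peptides))
```

With a full 15-mer, every 9-mer and 10-mer spans the central position, so each enumerated peptide carries the mutated residue.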

Measurement of mRNA expression of the HLA-A, HLA-B, and HLA-C genes

RNA isolated from fresh frozen tumor and normal tissue was subjected to transcriptome analysis at Macrogen Europe (Amsterdam, Netherlands). Library preparation was done using TruSeq mRNA kit (Agilent, USA) and sequencing was performed on a NovaSeq-6000 sequencer (Illumina, California, USA) to generate 2×150 bp reads and 30×10⁶ read pairs per sample. RNA paired-end read files were used as input in FASTQ format for all patients for tumor and normal tissue and analyzed as described in online supplemental methods. Allele abundances in the tumor samples were compared with the allele abundances of the corresponding normal tissue samples (range 29.37%–70.63%). If one allele in the tumor had a transcript abundance higher than 70.63%, it was considered imbalanced.
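The imbalance call described above reduces to a threshold on the relative abundance of the two alleles of a locus. The following sketch assumes that per-allele transcript counts are already available; the function names and the input format are hypothetical, and only the 70.63% cutoff comes from the text.

```python
def allele_fractions(count_allele1, count_allele2):
    """Relative abundance (in percent) of the two alleles of one HLA-I locus."""
    total = count_allele1 + count_allele2
    if total <= 0:
        raise ValueError("no transcripts detected for this locus")
    return 100.0 * count_allele1 / total, 100.0 * count_allele2 / total


def is_imbalanced(count_allele1, count_allele2, upper_limit=70.63):
    """Flag a locus when one allele exceeds the upper limit seen in normal tissue."""
    fraction1, fraction2 = allele_fractions(count_allele1, count_allele2)
    return max(fraction1, fraction2) > upper_limit


# Toy counts (made up): 80%/20% is flagged, 60%/40% is not.
print(is_imbalanced(800, 200), is_imbalanced(600, 400))
```

Because the two fractions sum to 100%, testing the larger one against 70.63% is equivalent to testing the smaller one against 29.37%.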

Prediction of good-binding peptides derived from NY-ESO-1

43 overlapping sequences derived from NY-ESO-1 (15-mer with 11 AA overlap) were used in combination with results from the HLA genotype of a healthy donor (online supplemental table 5) reactive to a NY-ESO-1 peptide pool to identify the donor-specific repertoire of HLA-I-restricted 9–10-mers with high and moderate binding affinity using the IEDB analysis resource NetMHCpan (V.4.1) tool (RRID:SCR_006604) in August 2022. Predicted peptides were filtered based on their PR score ≤3 to identify high and moderate binders.
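A tiling of 15-mers with an 11-residue overlap can be generated as sketched below. Two points are assumptions rather than statements from the text: the protein length of 180 residues used in the example, and the handling of the C-terminus, where a final peptide is anchored at the end of the sequence so that no residue is left uncovered. Under these assumptions the tiling yields 43 peptides, matching the number given above.

```python
def overlapping_peptides(protein, length=15, overlap=11):
    """Tile a protein with peptides of fixed length and fixed overlap."""
    if len(protein) <= length:
        return [protein]
    step = length - overlap
    starts = list(range(0, len(protein) - length + 1, step))
    # ASSUMPTION: add one terminal peptide if the regular tiling stops short.
    if starts[-1] + length < len(protein):
        starts.append(len(protein) - length)
    return [protein[s:s + length] for s in starts]


# Placeholder sequence of the assumed length; not the real NY-ESO-1 sequence.
placeholder = "A" * 180
print(len(overlapping_peptides(placeholder)))
```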

FluoroSpot assay

For analysis of TAA-specific immune responses, triplicates of 2×10⁵ PBMCs from a healthy donor were co-cultured without peptides, with peptides derived from NY-ESO-1 (43 overlapping peptides: 15-mers with 11 AA overlap) or with NY-ESO-1 peptide pool (1 µg of each peptide/mL, peptides&elephants, Germany, details in online supplemental table 5) on precoated anti-human interferon gamma (IFN-γ) FluoroSpot plates (Mabtech, Sweden) for 20 hours. For analysis of neoantigen-specific immune responses, 5×10⁴ expanded T cells (details in online supplemental methods) from patient #31 were co-cultured with autologous CD40 activated B cells (CD40Bs) in a 2:1 ratio without peptides, with individual neoantigens or with actin (1 µg of each peptide/mL, peptides&elephants, Germany, details in online supplemental table 5). Spot detection was performed on an AID iSpot Spectrum Reader (AID, Germany). To identify specific spots, the median number of spots of the negative control (cells co-cultured with anti-CD28 only) was subtracted from the median of cells co-cultured with the peptide of interest. Patients who showed ≥10 specific spots were counted as responders to the respective peptide condition.
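The responder definition above is a simple median-based background subtraction. A minimal sketch, with hypothetical function names and made-up spot counts, could look like this:

```python
from statistics import median


def specific_spots(peptide_wells, negative_control_wells):
    """Median spots with peptide minus median spots of the negative control."""
    return median(peptide_wells) - median(negative_control_wells)


def is_responder(peptide_wells, negative_control_wells, threshold=10):
    """A condition counts as a response at >= 10 specific spots (threshold from the text)."""
    return specific_spots(peptide_wells, negative_control_wells) >= threshold


# Toy triplicates (made up).
negative = [3, 5, 4]
print(specific_spots([25, 30, 28], negative), is_responder([25, 30, 28], negative))
print(specific_spots([12, 9, 14], negative), is_responder([12, 9, 14], negative))
```

Using the median rather than the mean makes the call less sensitive to a single outlying well in a triplicate.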

Statistical analyses and figures

Applicable statistical analyses were performed using GraphPad V.8.3.0 (GraphPad Prism, USA) as indicated in the corresponding figure legends. Graphs were generated using GraphPad and figures were created using Inkscape V.1.0beta1 (Free Software Foundation, USA). RStudio (R V.4.0.3)24 was used to generate Venn diagrams with the eulerr package (V.6.1.1)25 and oncoplots with GenVisR (V.1.22.1).26 Non-parametric tests were used if one of the groups did not pass the D’Agostino-Pearson omnibus K2 test of normality. Group sizes, levels of statistical significance, definition of error bars and applied tests are given in the figure legends.


Homozygosity of HLA-I loci is increased in Caucasian EGA patients

High-resolution HLA genotyping of germline DNA from EGA patients (n=80) revealed an unexpectedly high rate of HLA homozygosity (figure 1 and online supplemental table 2). The frequency of homozygosity in the EGA cohort was compared with an HLA-genotype-matched reference population obtained from the German bone marrow donor database (n=7605 out of 615,017 donors). Our analyses revealed germline homozygosity for at least one HLA-I locus in 35.0% of EGA patients compared with 19.1% in the HLA-matched general population (p<0.001) corresponding to an OR of 2.3 (95% CI 1.45 to 3.58). The frequency for homozygosity was also higher in EGA patients when looking at HLA-A (p<0.05, OR 1.9, 95% CI 1.11 to 3.24), HLA-B (p<0.01, OR 3.5, 95% CI 1.66 to 7.33) and HLA-C (p<0.01, OR 3.0, 95% CI 1.43 to 6.31) separately (figure 1). Five patients showed homozygosity for more than one HLA-I gene simultaneously. Moreover, when patients were stratified according to UICC stages in early (UICC I+II) versus advanced (UICC-III+IV) we observed that the fraction of patients with germline homozygosity was higher in advanced stages (21.43% UICC I+II vs 50.00% UICC III+IV, p<0.01) (online supplemental figure 1).
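The reported odds ratio for homozygosity at any HLA-I locus can be approximately reproduced from the stated percentages. The sketch below is an illustration only and not the authors' analysis, which was run in GraphPad. The count of 1453 homozygous reference donors is an assumption back-calculated from 19.1% of 7605, and the Baptista-Pike confidence interval is not reimplemented here.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test: P(X >= a) under the hypergeometric null."""
    row1 = a + b            # size of the patient cohort
    col1 = a + c            # all homozygous individuals
    total = a + b + c + d
    denominator = comb(total, row1)
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(total - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / denominator


# Patients: 35.0% of 80 -> 28 homozygous, 52 heterozygous.
# Reference: 19.1% of 7605 -> about 1453 homozygous (ASSUMED count), 6152 heterozygous.
a, b, c, d = 28, 52, 1453, 6152
print(round(odds_ratio(a, b, c, d), 2))
print(fisher_exact_greater(a, b, c, d))
```

With these counts the point estimate rounds to the OR of 2.3 given in the text.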

Figure 1

Esophagogastric adenocarcinoma (EGA) patients exhibit bias toward homozygosity at human leucocyte antigen-I loci. Differences in the frequency of homozygosity in EGA patients (n=80) and HLA-matched healthy controls (HC, n=7605) were calculated with the one-sided Fisher’s exact test. Exact p values are given in the table. In the graph, the dot indicates the OR, and the width of the horizontal lines represents the 95% CI for each condition. Intervals of the OR were computed using the Baptista-Pike method. Significant differences are indicated by asterisks. *p≤0.05, **p≤0.01, ***p≤0.001.

The amount of potentially immunogenic peptides derived from TAAs is reduced in patients with germline HLA-I homozygosity

We followed an in silico approach to elucidate the impact of HLA-homozygosity on the repertoire of tumor-specific peptides binding to patient-matched HLA-I molecules. The IEDB analysis resource20 NetMHCpan tool was used to predict the immunopeptidome derived from 10 relevant TAAs in EGA (n=80). The 10 TAAs were selected based on expression in tumor samples and previously described TAA-specific immune responses in EGA.27 We focused on 9–10-mers, which are considered the most relevant peptide lengths for presentation on HLA-I molecules.28 According to the thresholds recommended by IEDB user workshop 2018 and Koşaloğlu-Yalçın et al,29 PR≤3 was used to categorize specific peptides as good affinity binders. Significantly fewer unique good binding peptides with PR≤3 (n=28; 1312.0±150.5) were predicted for the homozygous group compared with the heterozygous group (n=52; 1563.0±130.1, p<0.0001) (figure 2A). Peptides predicted to bind individual HLA-A (424.6±151.9 in homozygous vs 608.4±69.4 in heterozygous, p<0.0001), HLA-B (578.3±157.7 in homozygous vs 702.2±82.8 in heterozygous, p<0.0001) and HLA-C molecules (613.9±124.9 in homozygous vs 678.7±81.2 in heterozygous, p<0.01) were significantly reduced in the homozygous compared with the heterozygous cohort (figure 2B). We next examined whether the effect of homozygosity was due to a single HLA-I gene or a combination. We divided the patients into those with homozygosity in only one gene (HLA-A, HLA-B, or HLA-C) and those having homozygosity in more than one HLA-I gene simultaneously. This analysis revealed that patients who are homozygous for only HLA-A had a significant reduction of the number of predicted good binders (n=17; p<0.0001) (figure 2C). Moreover, patients having homozygosity for more than one HLA-I gene simultaneously (n=5) had the lowest number of good binding peptides (1146±124 in homozygous vs 1563±130.1 in heterozygous; p<0.0001).

Figure 2

Esophagogastric adenocarcinoma (EGA) patients with HLA homozygosity have a reduced repertoire of good binding peptides (percentile rank ≤3) derived from tumor-associated antigens (TAAs). (A) Number of total predicted good binding peptides derived from TAAs in homozygous (n=28) and heterozygous patients (n=52). (B) Number of total predicted good binding peptides derived from TAAs that bind each individual HLA-I locus in homozygous (n=28) and heterozygous patients (n=52). (C) Number of total predicted good binding peptides derived from TAAs in patients who are homozygous at one locus only (HLA-A n=17, HLA-B n=3, HLA-C n=3) or at more than one locus simultaneously (n=5) compared with heterozygous patients (n=52). (D) Overlay of all predicted peptides binding the three HLA-I loci in homozygous (n=28) and heterozygous patients (n=52). Venn diagrams show peptides binding only to HLA-A (black), HLA-B (dark gray), HLA-C (red), HLA-A and B (light gray), HLA-A and C (yellow), HLA-C and B (orange) or HLA-A, HLA-B and HLA-C (green) in homozygous and heterozygous patients. The numbers in each category represent the mean of predicted peptides in each cohort binding to each HLA-A, HLA-B and HLA-C or the combination. (E) Percentage of good binding peptides predicted to be presented in more than one HLA class I molecule in homozygous (n=28) and heterozygous patients (n=52). (F) In silico modeling of the size of repertoires of predicted good binding peptides derived from TAAs in homozygous patients (n=28) with real and artificial genotypes and heterozygous patients (n=52). Significant differences calculated with one-tailed Mann-Whitney test (A, E), two-way ANOVA with Sidak’s multiple comparison (B), Kruskal-Wallis test with Dunn’s correction (C, F) are indicated by asterisks. *p≤0.05, **p≤0.01, ****p≤0.0001. When appropriate, mean±SD is indicated. ANOVA, analysis of variance.

The degree of HLA-I binding promiscuity of TAA-derived peptides was lower in EGA patients with HLA homozygosity

The peptide repertoire presented on an HLA class I molecule is determined by the structure of the peptide binding groove. Hence, molecules having similar grooves might present similar or even the same peptides. To evaluate the extent of promiscuity among predicted HLA-I ligands derived from TAAs, we determined predicted peptides that bind more than one HLA-I molecule (HLA-A, HLA-B, or HLA-C) in heterozygous and homozygous patients. This analysis confirmed a low overlap of peptides binding more than one HLA-I molecule in the homozygous cohort (figure 2D). Moreover, the promiscuity defined as the percentage of good binding peptides predicted to be presented by more than one HLA-I molecule was higher in heterozygous (19.2%±4.7%) than in homozygous patients (17.3%±4.3%) (figure 2E, p<0.05).
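The promiscuity measure defined above can be written down compactly. The sketch below is one possible reading: it assumes that the denominator is the number of unique good binders of a patient and that the two copies of a homozygous allele count as a single molecule. The function name and the toy allele and peptide labels are hypothetical.

```python
from collections import Counter


def promiscuity_percent(binders_by_molecule):
    """Percentage of unique good binders predicted for more than one HLA-I molecule.

    binders_by_molecule: dict mapping an HLA-I molecule name to the collection
    of its predicted good binders (PR <= 3). Keying by molecule name means a
    homozygous allele appears only once (ASSUMED handling).
    """
    molecule_count = Counter(
        peptide
        for peptides in binders_by_molecule.values()
        for peptide in set(peptides)
    )
    if not molecule_count:
        return 0.0
    promiscuous = sum(1 for n in molecule_count.values() if n >= 2)
    return 100.0 * promiscuous / len(molecule_count)


# Toy repertoire (made-up labels): one of five unique peptides binds two molecules.
toy = {"A-allele": {"p1", "p2", "p3"}, "B-allele": {"p3", "p4"}, "C-allele": {"p5"}}
print(promiscuity_percent(toy))
```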

In silico modeling of HLA heterozygosity leads to a higher amount of predicted TAA-derived good binders

Based on the results described above, lack of HLA-I alleles is related to a restricted, less diverse repertoire of TAA-derived good binders. Therefore, a set of the nine most frequent alleles (three each for HLA-A, HLA-B, and HLA-C) was selected from the healthy population of bone marrow donors (online supplemental table 6) and included in the homozygous HLA-I locus of each patient for in silico reversion of homozygosity. With these artificial genotypes and the canonical sequences of the TAAs, the prediction by IEDB20 was repeated. The fraction of unique good binders (PR≤3) binding the HLA-I artificial genotypes was calculated and compared with the fraction of unique good binders (PR≤3) binding the patient-specific HLA-I genotype (real genotype) and to the fraction of good binders in the heterozygous cohort (figure 2F). We observed a recovery of predicted good binding peptides (1312 peptides±150.5 homozygous real genotype vs 1531 peptides±124.9 artificial genotype p<0.0001 and 1563 peptides±130.1 in the heterozygous cohort, p=0.8337).
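The in silico reversion of homozygosity can be illustrated with a short sketch. The text does not state how one of the three frequent alleles is chosen for a given patient, so the rule used here (take the most frequent allele that differs from the patient's own) is an assumption. The allele names and their order are placeholders and are not taken from online supplemental table 6.

```python
def revert_homozygosity(genotype, frequent_alleles):
    """Build an artificial heterozygous genotype.

    genotype         : dict locus -> (allele1, allele2)
    frequent_alleles : dict locus -> alleles ordered by population frequency
    For each homozygous locus, the second copy is replaced by the first
    frequent allele that differs from the patient's allele (ASSUMED rule).
    """
    artificial = {}
    for locus, (allele1, allele2) in genotype.items():
        if allele1 == allele2:
            replacement = next(a for a in frequent_alleles[locus] if a != allele1)
            artificial[locus] = (allele1, replacement)
        else:
            artificial[locus] = (allele1, allele2)
    return artificial


# Placeholder allele names, for illustration only.
frequent = {
    "A": ["A_freq1", "A_freq2", "A_freq3"],
    "B": ["B_freq1", "B_freq2", "B_freq3"],
    "C": ["C_freq1", "C_freq2", "C_freq3"],
}
patient = {"A": ("A_freq1", "A_freq1"), "B": ("B_x", "B_y"), "C": ("C_z", "C_z")}
print(revert_homozygosity(patient, frequent))
```

Heterozygous loci are left untouched, so any change in the predicted repertoire can be attributed to the added allele alone.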

Mutational landscape of EGA patients

In addition to TAAs, MANAs generated by non-synonymous mutations (NsM) in the tumor-cell genome are important targets of antigen-specific T-cell responses.30 To elucidate the effect of homozygosity on the repertoire of MANAs, we focused the analysis on 38 tumors without microsatellite-instability (non-MSI) (online supplemental figure 2A) and described the five hypermutated tumors (tumor mutational burden (TMB)-high) separately, because their very large number of mutations would bias our analysis (online supplemental figure 2B). In total, 5855 mutations were identified using WES analysis (online supplemental figure 3A) of which 3810 were SNVs (100.2±46.9 per patient, range 33–198, n=38), 1505 were silent mutations (39.6±18.5 per patient, range 11–74, n=38), 207 were nonsense mutations (5.4±3.2 per patient, range 1–11, n=38), 157 were frameshift mutations (4.1±2.6 per patient, range 1–12, n=38), 98 were splice mutations (2.6±2.0 per patient, range 1–8, n=32), 67 were in-frame mutations (1.8±1.7 per patient, range 1–9, n=30), six were intron-exon mutations (2.0±2.0 per patient, range 1–6, n=6) and five were non-stop mutations (1.0 per patient, n=5). The mutational landscape of the non-MSI patients (n=38) revealed TP53 (76.3%) and CDKN2A (15.8%) as the two most frequently mutated tumor-suppressor genes (figure 3). In addition, mutations in the mucin family were common (87.5%) with MUC19 (36.8%) being the most commonly mutated gene. When stratifying the non-MSI patients into homozygous and heterozygous for HLA genes, the profiling of genomic alterations showed no significant variation (data not shown). Analysis of the TMB revealed no differences between homozygous (3.9±1.6 mut/Mb, n=16) and heterozygous patients (3.7±1.8 mut/Mb, n=22, p=0.237) (figure 4A).

Figure 3

Tumor mutational landscape. Waterfall plot displaying the landscapes of the top 50 most frequently mutated genes identified in esophagogastric adenocarcinoma samples (n=38). Genes are ordered according to their mutation frequency within this cohort (synonymous mutations marked in red and non-synonymous in blue). Mutation types are color-coded as indicated.

Figure 4

Esophagogastric adenocarcinoma (EGA) patients with HLA homozygosity have reduced repertoires of MANAs. (A) Tumor mutational burden (TMB) in homozygous (n=16) and heterozygous patients (n=22) represented by bar charts. (B) Number of potentially immunogenic mutations that produce neoantigens in homozygous (n=16) and heterozygous EGA patients (n=22). (C) Number of predicted unique neoantigens calculated per expressed single nucleotide variant (SNV) in homozygous (n=16) and heterozygous EGA patients (n=22). (D) Number of potentially immunogenic mutations normalized to expressed SNV that produce neoantigens binding each individual HLA allele in homozygous (n=16) and heterozygous patients (n=22). (E) Number of predicted neoantigens normalized to expressed SNV binding each individual HLA allele in homozygous (n=16) and heterozygous patients (n=22). In silico modeling of the size of repertoires of predicted neoantigens derived from expressed SNV in (F) homozygous patients with real, artificial and most common genotypes (n=16), (G) homozygous patients with artificial genotype (n=16) compared with heterozygous patients with real genotype (n=22), (H) heterozygous patients with real genotype and most common genotype (n=22). (I) Overlay of all predicted peptides among the three HLA-I loci in homozygous (n=16) and heterozygous patients (n=22). Venn diagrams show peptides binding only to HLA-A (black), HLA-B (dark gray), HLA-C (red), HLA-A and B (light gray), HLA-A and C (yellow), HLA-C and B (orange) or HLA-A, HLA-B and HLA-C (green) in homozygous patients with real, artificial and most common genotype and heterozygous patients with real genotype. The numbers inside the circles represent the mean of predicted peptides in each cohort binding to each particular HLA-A, HLA-B and HLA-C or the combination. 
Significant differences calculated with one-tailed Mann-Whitney test (A–C, G), two-way ANOVA with Sidak’s multiple comparison (D, E), Friedman test with Dunn’s multiple comparison test (F) and one-tailed Wilcoxon matched-paired test (H) are indicated by asterisks. *p≤0.05, **p≤0.01, ****p≤0.0001. When appropriate, mean±SD is indicated. ANOVA, analysis of variance; MANAs, mutation-associated neoantigens.

Reduced amount of potentially immunogenic neoantigens in EGA patients with HLA homozygosity

For the analyses of MANAs, we focused on SNVs, as these are the most frequent mutations and hence the primary determinant of TMB (online supplemental figure 3A). In addition, we also included frameshift InDels, as the relevance of neoantigens derived from these mutations has been described in cancer.31 Before predicting potentially immunogenic MANAs, we excluded mutations located in genes that are likely not expressed in esophageal tissue based on GEPIA2. From the 3803 SNVs originally detected, 1793 expressed according to GEPIA2 (47.1% of the originally detected mutations) were entered with their corresponding HLA genotypes in the IEDB20 to identify good binding MANAs (9–10-mers, PR≤3). The PR of the corresponding wild-type peptides was used to estimate the affinity difference for any given non-mutated/mutant peptide pair (differential agretopicity index, DAI)23 as an indicator of neoantigen dissimilarity and immunogenicity. Only 18.0% of the identified SNVs (686/3803) resulted in potentially immunogenic MANAs (PR≤3, DAI>1). While the number of potentially immunogenic mutations was similar in homozygous (range 8–39, 17.3±8.3) and heterozygous patients (range 5–38, 18.6±10.2) (figure 4B), the number of neoantigens per expressed SNV was significantly lower in the homozygous cohort (0.6±0.1 vs 0.7±0.2 in heterozygous patients, p<0.01) (figure 4C). Similarly, the number of potentially immunogenic SNVs that gave rise to peptides binding HLA-A (0.11±0.08 in homozygous vs 0.19±0.06 in heterozygous, p<0.01) and HLA-B (0.15±0.05 in homozygous vs 0.20±0.07 in heterozygous, p<0.05) was significantly lower in the homozygous cohort (figure 4D). This resulted in fewer predicted neoantigens per expressed SNV binding to HLA-A (0.17±0.13 in homozygous vs 0.27±0.09 in heterozygous, p<0.01) and HLA-B (0.22±0.09 in homozygous vs 0.31±0.13 in heterozygous, p<0.05) in the homozygous cohort (figure 4E).

Similarly, 157 detected frameshifts were corrected for mutations located in genes, which are likely not expressed in gastroesophageal tissue based on GEPIA2. Stratification into zygosity showed no differences in the number of frameshifts detected (online supplemental figure 3B). The amino-acid sequences of the remaining 101 GEPIA-expressed frameshifts (60 frameshift deletions and 41 frameshift insertions in 36 patients) were used in combination with the HLA genotype to identify frameshift-derived neoantigens. In total, 1064 high and moderate affinity frameshift-derived neoantigens (PR≤3) were predicted. The number of frameshift-derived neoantigens was higher in the heterozygous group (online supplemental figure 3C, D), but this difference was only significant when the data were normalized to the number of mutations per patient (6.9±7.8 in homozygous vs 13.5±14.0 in heterozygous, p<0.05) (online supplemental figure 3D).

Artificial restoration of heterozygosity in homozygous patients leads to a normalized neoantigen repertoire with low promiscuity of neoantigens to HLA-I proteins irrespective of the genotype

To strengthen our hypothesis that the reduced neoantigen burden is due to homozygosity and not related to different patterns of mutations, we recalculated neoantigen repertoires after in silico reversion of homozygosity as described above. Neoantigen prediction for homozygous patients was repeated using the artificial genotypes and the set of GEPIA2-filtered SNVs and frameshifts. Predicted good binding peptides with PR≤3 were classified as MANAs and the resulting repertoires were compared with those obtained with the real genotype and with a reference genotype, which contained the first and second most common alleles for each HLA-I gene (most common genotype, online supplemental table 6). For SNVs, the mean number of neoantigens increased from 28.6 predicted with the real genotype to 36.3 with the artificial heterozygous genotype (figure 4F; p<0.01). With the most common genotype, the increase was even larger (37.4; p<0.001). The neoantigen repertoire predicted with the artificial genotype was similar to that predicted with the most common genotype (figure 4F) and also to the repertoire of the heterozygous cohort (figure 4G). For frameshifts, neither restoration of heterozygosity nor the use of the most common genotype modified the size of the repertoire of frameshift-derived neoantigens observed in the homozygous cohort (online supplemental figure 3D). Interestingly, comparison of the predicted neoantigen repertoires of heterozygous patients using the real or the most common genotype also revealed an increase in the number of predicted neoantigens derived from both SNVs (figure 4H, p<0.01) and frameshifts (online supplemental figure 3E, p<0.05).

Figure 4I depicts the overlay of all predicted neoantigens binding the three HLA-I proteins in all genotype groups. Accordingly, 90.0% of the predicted neoantigens bind exclusively to the product of one of the analyzed HLA genes and there is almost no overlap among the sets. We calculated the percentage of neoantigens with binding promiscuity and observed similar fractions in the homozygous cohort predicted with (1) real (11.0%±10.4%), (2) artificial (12.5%±12.2%) and (3) most common genotype (10.2%±11.7%) and in the heterozygous cohort (11.3%±9.4%) (online supplemental figure 4A). The same holds for the inter-allele promiscuity of neoantigens to the different alleles of a specific HLA-I gene in heterozygous patients. Specifically, 96.8% of HLA-A binding, 94.5% of HLA-B binding and 85.3% of HLA-C binding neoantigens were predicted to bind exclusively to one of the two alleles (online supplemental figure 4B).
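
The promiscuity fraction reported above can be sketched as follows, assuming the predictions are available as (peptide, label) pairs, where the label is either an HLA-I gene or an individual allele. The function name and the input layout are illustrative assumptions, not the analysis code of the study.

```python
from collections import defaultdict


def promiscuity_percent(binding_pairs):
    """Percentage of peptides predicted to bind more than one label.

    binding_pairs: iterable of (peptide, label) tuples that already
    passed the binding cut-off. The label can be an HLA-I gene
    ("HLA-A") or an allele ("HLA-A*02:01"), depending on whether
    inter-gene or inter-allele promiscuity is of interest.
    """
    labels_by_peptide = defaultdict(set)
    for peptide, label in binding_pairs:
        labels_by_peptide[peptide].add(label)
    if not labels_by_peptide:
        return 0.0
    promiscuous = sum(
        1 for labels in labels_by_peptide.values() if len(labels) > 1
    )
    return 100.0 * promiscuous / len(labels_by_peptide)
```

For example, if one of four predicted peptides binds both HLA-A and HLA-B and the other three bind a single gene each, the function returns 25.0.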

Allelic imbalance of HLA-I genes is common in EGA

Impaired expression of HLA-A, HLA-B and HLA-C determined by immunohistochemistry has been described in several cancer types and is associated with an inferior prognosis.32 Genomic alterations account for only a small fraction of impaired HLA-I expression (2/38 cases detected in our cohort, data not shown). Transcriptional suppression is another mechanism associated with impaired expression of HLA-I.33 We performed transcriptome sequencing of tumors from the 19 patients for whom quality and quantity of RNA were sufficient to assess allele-specific expression of HLA-I genes. Allelic imbalance was detected in 11 patients (figure 5A, patient numbers marked in green). Three out of 12 heterozygous patients showed combined allelic imbalance of HLA-A, HLA-B and HLA-C, suggesting loss of heterozygosity (LOH) in their tumors (patients 8, 15 and 30). Six out of 12 patients showed allelic imbalance for one or two HLA-I genes and only 3/12 were balanced for all alleles. Thus, 75% of the heterozygous patients (9/12) showed imbalanced expression of at least one HLA-I gene (figure 5B, p=0.0240). None of the patients with HLA-I allelic imbalance carried mutations within HLA-I genes.

Figure 5

HLA-allelic imbalance is common in esophagogastric adenocarcinoma. (A) HLA-allelic imbalance was calculated in each HLA-I family (A–C) by comparing allele abundances in the tumor to the allele abundances of the corresponding normal tissue in 19 patients. Allele abundances >70.63% were considered imbalanced (red, patient numbers marked in green). Balanced allele frequencies (range 29.37%–70.63%) are depicted in gray (patient numbers marked in black). Patients with germline homozygosity for which only one allele with 100% abundance was detected are marked with an asterisk in the corresponding allele (red). Numbers of patients with combined allelic imbalance of HLA-A, HLA-B and HLA-C are marked with green and an asterisk. (B) Proportion of homozygous (black) or heterozygous (gray) patients who are balanced or imbalanced for at least one HLA-allele. Number of total predicted good binding peptides derived from (C) tumor-associated antigens (TAAs) or (D) somatic mutations calculated with the germline HLA genotype and with the expressed HLA alleles found in the tumor samples of patients with allelic imbalance (n=11; 2 homozygous, 6 heterozygous and 3 heterozygous with microsatellite instability (MSI)). Significant differences calculated with one-sided χ² test (B) and one-tailed Wilcoxon matched-pairs test (C, D) are indicated by asterisks. *p≤0.05, ***p≤0.001.
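
The thresholds given in the legend (balanced range 29.37%–70.63%) can be applied as in the following minimal sketch. It assumes that allele abundance at a heterozygous locus is approximated by allele-specific read counts of the two alleles; the comparison with normal tissue described in the legend is not modeled, and the function names are hypothetical.

```python
IMBALANCE_THRESHOLD = 70.63  # per cent, upper bound of the balanced range


def allele_fractions(reads_allele_1, reads_allele_2):
    """Return the abundance of each allele as a percentage of all reads."""
    total = reads_allele_1 + reads_allele_2
    if total <= 0:
        raise ValueError("at least one read is required")
    return (100.0 * reads_allele_1 / total, 100.0 * reads_allele_2 / total)


def is_imbalanced(reads_allele_1, reads_allele_2):
    """True if the more abundant allele exceeds the 70.63% threshold."""
    return max(allele_fractions(reads_allele_1, reads_allele_2)) > IMBALANCE_THRESHOLD


def imbalanced_genes(counts_by_gene):
    """List the HLA-I genes classified as imbalanced.

    counts_by_gene: dict such as {"HLA-A": (800, 200), "HLA-B": (510, 490)}.
    """
    return [
        gene for gene, (a, b) in counts_by_gene.items() if is_imbalanced(a, b)
    ]
```

Because the two fractions sum to 100%, testing the major allele against 70.63% is equivalent to testing the minor allele against 29.37%.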

The results highlight that, despite a complete set of HLA-I alleles at the germline level, allelic imbalance was present in the tumor samples of the majority of EGA patients. This loss of HLA alleles results in a reduced repertoire of predicted good-binding peptides derived from TAAs (figure 5C, p=0.0005) and MANAs (figure 5D, p=0.0005) when the repertoire is recalculated with the expressed HLA genotype instead of the germline HLA genotype.
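
The comparison between germline and expressed genotype can be sketched as a set union over the alleles that remain available for presentation. The data layout (a dictionary from allele to its predicted good binders) and the peptide names are illustrative assumptions.

```python
def repertoire_size(binders_by_allele, alleles):
    """Number of unique predicted good-binding peptides for a set of alleles.

    binders_by_allele: dict mapping an HLA-I allele to the set of peptides
    predicted to bind it (already filtered by the binding cut-off).
    alleles: the alleles considered, e.g. the germline genotype or only the
    alleles still expressed in the tumor.
    """
    peptides = set()
    for allele in alleles:
        peptides |= binders_by_allele.get(allele, set())
    return len(peptides)


# Toy example with hypothetical peptides: losing expression of one allele
# removes the peptides that only this allele can present.
binders = {
    "HLA-A*02:01": {"P1", "P2", "P3"},
    "HLA-A*24:02": {"P3", "P4"},
}
germline = repertoire_size(binders, ["HLA-A*02:01", "HLA-A*24:02"])
expressed = repertoire_size(binders, ["HLA-A*24:02"])
```

With low binding promiscuity, as reported for this cohort, few peptides are shared between alleles, so the repertoire shrinks almost in proportion to the peptides bound by the lost allele.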

TAA-specific T-cell responses are polyclonal and related to HLA-I alleles

In a previous publication, we showed that endogenous cellular immune responses against NY-ESO-1 are common in EGA.27 To evaluate whether these responses against NY-ESO-1 are polyclonal and how they are shaped by allelic variation of HLA-I genes (online supplemental table 5), we used PBMCs from an NY-ESO-1 responsive donor with enough cells to perform this experiment. A total of 43 overlapping 15-mer peptides derived from the NY-ESO-1 protein were used to predict high-affinity 9–10-mer peptides binding HLA molecules of the donor (figure 6A). Out of the 43 overlapping 15-mer peptides, 32 (74.4%) contained 9–10-mer peptides predicted to bind the donor’s HLA-I molecules with high affinity. Individual testing of the 43 overlapping 15-mer sequences of NY-ESO-1 in FluoroSpot assays revealed IFN-γ responses against 16/43 (37.2%) peptides. Of these responses, 81.2% (13/16) were against 15 mers predicted to generate 9–10-mer peptides binding to the HLA molecules of the donor with high affinity. A peptide pool containing all 43 sequences was used in addition to the two positive controls (CEF and CD3) and demonstrated the strong endogenous IFN-γ response against the NY-ESO-1 peptide pool in this donor (figure 6B,C). A fraction of the peptides (36.2%) were predicted to bind to a single HLA-I molecule of the donor (figure 6D).
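
How overlapping 15-mers (with the 11 amino acid overlap stated in the figure 6 legend) and their embedded 9–10 mers can be enumerated is shown in the sketch below. The handling of the final peptide, which is shifted so that it ends at the last residue, is an assumption; the actual peptide library design may differ.

```python
def overlapping_peptides(sequence, length=15, overlap=11):
    """Tile a protein sequence with peptides of fixed length and overlap.

    Assumption: if the regular tiling does not reach the C-terminus, one
    extra peptide covering the last `length` residues is appended.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be >= 0 and smaller than length")
    if len(sequence) <= length:
        return [sequence] if sequence else []
    step = length - overlap
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


def embedded_kmers(peptide, sizes=(9, 10)):
    """All contiguous sub-peptides of the given sizes within a peptide."""
    return [
        peptide[i:i + k]
        for k in sizes
        for i in range(len(peptide) - k + 1)
    ]
```

Each 15 mer contains seven 9 mers and six 10 mers, so every tested peptide contributes 13 candidate sequences per HLA-I allele to the binding prediction.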

Figure 6

Reactivity to the NY-ESO-1 peptide pool is derived from responses to multiple peptides with distinct affinities to HLA-I alleles. (A) The HLA genotype of a healthy donor was used for analyses of 43 overlapping peptides (15-mer with 11 amino acid (AA) overlap) derived from NY-ESO-1 to predict binding affinities. Dot plot shows all predicted peptides per sequence with color-coded HLA alleles for peptides with a percentile rank (PR)≤3. Heatmap shows in silico predicted (9–10-mer peptides, PR≤3) and experimentally detected (IFN-γ) responses for the indicated 15-mer sequences. (B) NY-ESO-1-specific IFN-γ secretion was assessed by FluoroSpot assay in triplicates. Representative FluoroSpot pictures after co-culture of peripheral blood mononuclear cells (PBMCs) from a healthy donor without peptide ((−) peptides), with 43 overlapping peptides (15-mer with 11 AA overlap) derived from NY-ESO-1 (peptides 4 and 41 are shown as examples) or with the NY-ESO-1 peptide pool. A peptide pool of cytomegalovirus, Epstein-Barr virus and influenza virus (CEF) was used as biological positive control, while an antibody against CD3 was used as technical positive control. (C) The numbers of IFN-γ spots per 2×10⁵ cells in FluoroSpot analysis are shown (mean±SD). (D) Pie chart showing the fraction of peptides that are exclusively presented by each color-coded HLA allele. The fraction of peptides that are presented by more than one HLA allele is shown in black. (E) Heatmap showing abundance of T-cell receptor clonotypes after neoantigen-specific expansion of T cells from patient #31. Significant differences calculated with two-tailed multiple unpaired t-test are indicated by asterisks. ***p≤0.001, ****p≤0.0001.

To further evaluate the connection between HLA homozygosity and T-cell responses, we assessed the presence of T-cell responses targeting personalized candidate SNV-derived neoantigens in cancer patient #31. To this end, in vitro expanded T cells isolated from peripheral blood were co-cultured with autologous CD40Bs as APCs pulsed with the synthetic peptides identified through WES. T-cell reactivity was tested by clonotype expansion and IFN-γ release assessed by TCR sequencing and FluoroSpot, respectively. We detected T-cell reactivity to peptide 2 (GRGTGGSTGDADGPG) and peptide 3 (GGSTGDADGPGGPGI) (online supplemental figure 5A, B) and observed peptide-specific clonal expansion in each condition (figure 6E). Interestingly, the two peptides that caused the observed IFN-γ responses were peptides predicted to bind to an HLA allele with reduced expression in the tumor sample (online supplemental figure 5A, HLA-B*15:01).


Our study is the first detailed analysis of germline HLA-I homozygosity and imbalanced expression of HLA-I genes and their effect on the repertoire of potentially immunogenic peptides in EGA. Individuals who are heterozygous at all HLA genes have a greater repertoire of immunogenic peptides than homozygous patients34 and our research suggests that carrying a heterozygous HLA genotype is associated with superior cancer immunosurveillance. UICC stages of patients with germline homozygosity were higher in our cohort, but further investigations in larger cohorts with long-term prospective follow-up are needed to determine the impact of germline homozygosity of HLA-I on the prognosis of EGA patients.

We performed state-of-the-art HLA genotyping in accordance with internationally accepted standards of procedure for bone marrow donation, where accurate HLA genotyping is indispensable. The applied targeted sequencing approach is more accurate than HLA-untargeted sequencing tools, which require additional filtering algorithms to extract the HLA-specific reads and lack appropriate reference annotations.35 HLA genotyping from whole-genome sequencing is also challenging and not as accurate as the clinical-grade HLA genotyping used in our study. Low sequence coverage36 and low accuracy of HLA-genotyping tools37 limit applicability of the TCGA database or other large and well-annotated cohorts for the topic of our study. Our prospective analysis demonstrates an increased rate of germline homozygosity for HLA-I alleles in EGA patients. Shah et al38 described increased HLA homozygosity in patients with chronic lymphocytic leukemia (CLL) and their report was the first to correlate the clinical progression of a non-infectious neoplasm with HLA homozygosity. The rate of homozygosity for any HLA-I allele in CLL patients was 14.0% in comparison to 11.9% among the control population. Similarly, ORs of 1.3 and 1.5 have been described for HLA-I homozygosity in non-Hodgkin lymphomas (NHL).38 With an OR of 2.3 (35.0% vs 19.0%), the difference observed in our cohort was more pronounced than in CLL and NHL patients. Moreover, homozygosity was significantly increased for HLA-A, HLA-B and HLA-C separately, which was not described for CLL. Of note, the two studies in hematological cancers compared frequencies of homozygosity to the overall population, whereas we used an HLA-genotype-matched control cohort with a relatively high rate of germline homozygosity. The difference would be even larger when compared with the HLA-unmatched general population. In one of the few previous publications addressing germline homozygosity, Liu and Hildesheim39 used data from genome-wide association studies and did not observe increased homozygosity in non-virus-associated solid tumors. However, EGA samples were not included in this analysis.
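
As a plausibility check, the OR of 2.3 quoted above for homozygosity rates of 35.0% vs 19.0% can be reproduced from the two proportions alone. The sketch below uses the rounded percentages from the text and is not the statistical analysis performed in the study; the function name is illustrative.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the proportion of an exposure in cases and controls."""
    for p in (p_cases, p_controls):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# Homozygosity in EGA patients (35.0%) vs matched controls (19.0%).
or_ega = odds_ratio(0.35, 0.19)
print(round(or_ega, 1))  # 2.3
```

The odds are 0.35/0.65≈0.54 in patients and 0.19/0.81≈0.23 in controls, which gives a ratio of approximately 2.3.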

Homozygosity of HLA-I molecules is associated with susceptibility and clinical course of viral and bacterial infectious diseases.2–4 In cancer, downregulation or loss of one or more HLA molecules and/or defective antigen presentation in tumor cells mediate immune escape and resistance to CKI.13 Chowell et al16 analyzed a cohort of 1535 advanced melanoma and NSCLC patients and described that maximal heterozygosity at HLA-I loci was related to improved overall survival after CKI. Similarly, Abed et al18 reported a shorter overall survival for individuals with HLA-I homozygosity in a cohort of 170 advanced NSCLC patients, which was particularly relevant for patients with tumors expressing PD-L1 (≥50%). Others described a lower impact of HLA-homozygosity on susceptibility to CKI.17 The restricted repertoire of tumor antigens that are presented in the tumor microenvironment of HLA-homozygous cancer patients in our cohort could be of similar importance in the context of CKI. The high rate of HLA-homozygosity observed in our cohort suggests an inherited increased risk for EGA and could explain the low objective response rates to anti-PD-1 therapy in this cancer type.10

According to the cancer immunoediting hypothesis proposed by Dunn et al,40 tumor-specific immune responses and cancer cells undergo continuous reciprocal shaping. Hence, the more diverse the HLA haplotype, the higher the selective pressure elicited by antitumor immune responses on the mutational landscape and the neoantigen burden.41 McGranahan et al13 described an increased mutational burden and neoantigen load in lung adenocarcinoma patients with acquired LOH. However, we did not observe differences in the overall mutational burden when stratifying the non-MSI patients according to zygosity of HLA-I genes. This result was unexpected as germline homozygosity should be associated with a generally higher selective pressure. However, the onset of homozygosity in the McGranahan cohort is unknown and the selective pressure on tumor cells may have been increased for longer periods in patients with later tumor stages. As the majority of NsM are non-immunogenic, our cohort may be too small to detect differences in mutational burden. A larger cohort would be needed to allow consideration of defective antigen presentation or other confounders of cancer immunosurveillance.

Contrary to what has been described in the context of infectious diseases, where a high fraction of HLA class I ligands (>60%) can bind to two or more HLA molecules simultaneously,42 we observed a rather low promiscuity of antigenic peptides derived from TAAs and MANAs, further stressing the importance of HLA diversity in cancer patients. Our transcriptomic analysis showed that most heterozygous patients carry allelic imbalance in their tumors, which limits the repertoire of potentially immunogenic peptides. Imbalanced expression of HLA-I genes may allow tumor cells to evade both natural killer cell attack and recognition by cytotoxic T cells, as described in colorectal cancer.43

Imbalanced expression of HLA-I affects tumor-specific immune responses against shared and private antigens. Our study exemplified the impact of one lost HLA-I allele on the repertoire of TAA-specific T-cell responses using analyses of peptide-specific responses in an NY-ESO-1 responsive individual. T-cell responses against MANAs were less abundant, but following peptide-specific expansion we observed neoantigen-specific responses against a peptide binding to an HLA-I allele that showed decreased expression in tumor samples. Woldemeskel et al used a similar expansion protocol to evaluate T-cell responses to common-cold coronaviruses or SARS-CoV-2 in seronegative individuals who had no prior contact with SARS-CoV-2.44 As this experiment almost exclusively induced responses to peptides of the common-cold variant, the MANA-specific T-cell responses in our study most likely represent expansion of pre-existing low-abundance T-cell responses. However, de novo induction is also possible.

None of the patients with allelic imbalance of HLA-I genes carried mutations within HLA-I genes and only a small fraction of them acquired LOH in their tumors. This suggests that the imbalance of HLA-I genes observed in the majority of included patients might be reversible. Accordingly, pharmacological agents that aim to induce upregulation of HLA class I genes (eg, IFN-γ45 or 5-aza-2′-deoxycytidine)46 appear promising to enhance susceptibility of EGA to immunotherapy.

Our study has some limitations. EGA is a rare disease and, as approximately 80.0% of patients are treated with neoadjuvant chemotherapy or chemoradiotherapy, the number of available high-quality samples from previously untreated patients allowing complex ex vivo sequencing experiments was limited. The quantity of patient-derived material also did not allow HLA-I immunopeptidomics, which would be complementary to our data. Processes such as expression of immune-regulatory molecules, infiltration by immune-inhibitory bystander cells or defective antigen processing also play an important role and need to be considered as additional mechanisms of immune escape.

In conclusion, our study demonstrates an increased rate of germline HLA-I homozygosity and allelic imbalance among EGA patients. We show that HLA-homozygosity is associated with significantly smaller repertoires of potentially immunogenic peptides derived from TAAs and MANAs. Our data suggest that patients with HLA-homozygosity have poor cancer immunosurveillance and consequently an increased susceptibility to EGA. The study has important implications for monitoring of immune responses in clinical trials in EGA.

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by our local ethics committee (no. 16-317). Participants gave informed consent to participate in the study before taking part.


We thank the Institute of Transfusion Medicine of the University Hospital Cologne and the Stefan Morsch Foundation for their support. We thank the technicians Monika Keiten-Schmitz and Alina Manu for their support. We furthermore thank the Regional Computing Center of the University of Cologne (RRZK) for providing computing time on the DFG-funded (Funding number: INST 216/512/1FUGG) High Performance Computing (HPC) system CHEOPS as well as for their support.


  • X @mariale_gm25

  • Contributors MAG-M and HAS designed the research program and wrote the manuscript; MAG-M, MT, EB, KW, JL, DK, LM, MN, PM, AMY and AQ conducted experiments and acquired and analyzed data; JG, TZ, WS, CB, RT, BG, AQ, AMH, MvB-B, MP and HAS provided reagents and facilities, recruited patients, gathered the clinical data for the study and made conceptual contributions to the study; HAS supervised the project and is the guarantor. All authors have read and approved the final manuscript.

  • Funding Research reported in this article was supported by the German Research Foundation (No. 325827080 and No. 418074181).

  • Competing interests MvB-B: Honoraria for advisory boards, for invited talks from BMS and financial support for research projects from Astellas, Roche and MSD. HAS: Financial support for research projects from Astra Zeneca. All other authors declare no conflicts of interest.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.