Abstract
Background Human Leukocyte Antigen (HLA) genes are critical for the presentation of neoantigens to the immune system by cancer cells. Deletion of HLA alleles, known as HLA loss of heterozygosity (LOH), has been highlighted as a key immune escape mechanism. Validated algorithms to detect HLA LOH from sequencing data are critical for exploring the biological impact of HLA LOH and assessing its utility as a clinical biomarker.
Methods We developed DASH (Deletion of Allele-Specific HLAs), a machine learning algorithm trained on data from 279 patients on the ImmunoID NeXT Platform, using features that account for probe capture variability between alleles and incorporate information from the regions flanking each HLA gene. To understand the contribution of the boosted HLA-region sequencing on the ImmunoID NeXT Platform, we performed an in silico downsampling analysis. To assess DASH’s performance at variable tumor purities and HLA LOH subclonalities, we identified three tumor-normal cell lines with HLA LOH and created in silico mixtures. Furthermore, we designed patient-specific primers targeting specific alleles for 21 patients and applied digital PCR (dPCR) to validate the HLA allele copy number status of these patients. Finally, we applied DASH to 611 patients spanning 15 tumor types.
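To make the purity and subclonality axes of the in silico mixtures concrete, the following is a minimal sketch of the allele fraction one would expect at a heterozygous HLA gene after mixing. It is not part of DASH. It assumes a simple copy-loss model in which normal cells carry one copy of each allele and affected tumor cells carry zero copies of the lost allele and one copy of the kept allele. The function name and parameters are hypothetical.

```python
def expected_lost_allele_fraction(purity: float, subclonal_fraction: float = 1.0) -> float:
    """Expected read fraction of the deleted allele at a heterozygous HLA gene.

    Assumptions (illustrative only, not taken from the DASH method):
      * normal cells: 1 copy of each allele
      * tumor cells with HLA LOH: 0 copies of the lost allele, 1 of the kept allele
      * tumor cells without HLA LOH: 1 copy of each allele
      * reads are sampled in proportion to allele copies (no capture bias)
    """
    if not (0.0 <= purity <= 1.0):
        raise ValueError("purity must be in [0, 1]")
    if not (0.0 <= subclonal_fraction <= 1.0):
        raise ValueError("subclonal_fraction must be in [0, 1]")

    # Average copies per cell of each allele across the whole mixture.
    lost_copies = 1.0 - purity * subclonal_fraction
    kept_copies = 1.0
    return lost_copies / (lost_copies + kept_copies)


if __name__ == "__main__":
    # Print the expected signal over a small grid of mixtures.
    for purity in (0.1, 0.27, 0.5, 0.9):
        for subclonal in (0.5, 1.0):
            frac = expected_lost_allele_fraction(purity, subclonal)
            print(f"purity={purity:.2f} subclonal={subclonal:.1f} lost-allele fraction={frac:.3f}")
```

Under these assumptions a pure normal sample gives a fraction of one half, and a clonal loss at 50% purity gives one third. The departure from one half shrinks as purity or subclonal fraction falls, which is why detection becomes harder in low-purity mixtures.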
Results In cross-validation analyses across patient samples, DASH achieved 98.7% specificity and 92.9% sensitivity, while LOHHLA, a widely used algorithm, reached only 94.3% and 78.8%, respectively (figure 1). Downsampling analyses demonstrated that DASH benefits significantly from the boosted HLA sequencing on the ImmunoID NeXT Platform: its F-score dropped by 0.06 after downsampling to the sequencing depth of other exome platforms. In cell line mixture analyses, DASH demonstrated greater than 99% specificity across all tumor purity and subclonality levels and greater than 98% sensitivity at tumor purities above 27%. Moreover, DASH demonstrated 100% sensitivity and specificity in dPCR experiments across 21 tumor samples with stable controls. We applied DASH to a large pan-cancer cohort and found that 18% of patients had HLA LOH (figure 2). We identified strong associations between HLA LOH and genomic instability. Moreover, we demonstrated relationships between HLA LOH and markers of immune pressure, such as a correlation with CD274 (PD-L1) expression and allele-specific neoantigen enrichment for deleted HLA alleles.
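For readers less familiar with the metrics reported above, the sketch below shows how sensitivity, specificity and F-score follow from confusion-matrix counts. The counts in the usage example are invented purely for illustration and are not the study's data. The class and function names are hypothetical, and the F-score is assumed here to be the standard F1 (harmonic mean of precision and sensitivity).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Per-allele call counts against a reference truth set (illustrative)."""
    tp: int  # HLA LOH present and called
    fp: int  # HLA LOH absent but called
    tn: int  # HLA LOH absent and not called
    fn: int  # HLA LOH present but missed


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of true HLA LOH events that were detected.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of intact alleles correctly left uncalled.
    return c.tn / (c.tn + c.fp)


def precision(c: ConfusionCounts) -> float:
    # Fraction of HLA LOH calls that were correct.
    return c.tp / (c.tp + c.fp)


def f_score(c: ConfusionCounts) -> float:
    # F1: harmonic mean of precision and sensitivity (assumed definition).
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the DASH evaluation.
    counts = ConfusionCounts(tp=90, fp=10, tn=190, fn=10)
    print(f"sensitivity={sensitivity(counts):.3f}")
    print(f"specificity={specificity(counts):.3f}")
    print(f"F-score={f_score(counts):.3f}")
```

Because specificity depends on the number of true negatives while the F-score does not, the two can move differently when HLA LOH is rare in a cohort, which is one reason to report both.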
Conclusions DASH is a highly sensitive HLA LOH detection algorithm that has been extensively validated using cross-validation, in silico downsampling, cell line mixtures and dPCR. Applied to a large pan-cancer cohort, it demonstrated the widespread impact of HLA LOH.