Abstract
Background Approximately half of the human genome is derived from repetitive sequences that can copy themselves to new genomic loci. These elements, collectively referred to as the ‘repeatome’, are typically silenced in healthy somatic tissues but de-repressed in tumors, and their expression has been linked to inflammation and genome instability. Peptides derived from repetitive elements have been discussed as potential cancer vaccine candidates, with the added benefit that immune escape cannot be achieved through the silencing of a single locus. However, accurately identifying which repeat transcripts are expressed in a tumor-specific manner has remained a challenge.
Methods ROME has developed a state-of-the-art ‘repeatomics’ machine learning platform (‘ROMEQuant™’) designed to quantify the expression of highly repetitive elements with substantially improved sensitivity and specificity over currently available algorithms. We used ROMEQuant to profile the repeatome of large-scale, publicly available tumor and normal RNA-seq data (TCGA, GTEx) and additionally performed western blot analysis on a set of esophageal tumor, normal-adjacent to tumor (NAT), and normal esophagus samples (N = 5 each; N = 15 total) to validate protein expression of a candidate repeat protein. We reanalyzed public immunopeptidomics data to confirm antigen presentation.
Results Leveraging ROMEQuant’s accurate quantification of repeat protein-coding loci, we used RNA-seq and immunopeptidomics to identify over 100 putatively tumor-specific peptides derived from repetitive elements. These peptides are predicted to cover HLA alleles in >80% of the TCGA population and include multiple classes of repeat proteins. We selected one repeat protein with near-universal expression in esophageal cancer for further experimental validation in a set of tumor, NAT, and normal esophagus samples. Its protein expression was high in all five tumor samples, absent in nine of the ten non-tumor samples, and minimal in the tenth. Reanalysis of immunopeptidomics data from a published study of esophageal cancer identified peptides derived from this target presented on five of seven tumors.
Conclusions Many repetitive elements are specifically expressed in tumors, making them promising cancer vaccine targets. However, their repetitiveness makes their expression difficult to quantify. ROME has developed tools that make it possible to accurately quantify repeat element expression and identify promising immunotherapy targets derived from repetitive elements. Here, we demonstrate the ability of the platform to identify repeat-derived, tumor-specific peptides that are presented by a range of HLA alleles and may be suitable for the development of a population cancer vaccine.
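As an illustration of what a population HLA coverage figure such as the one reported in Results means, the following minimal sketch computes the fraction of a cohort in which at least one peptide in a candidate set is predicted to bind an allele carried by the patient. It is not the ROMEQuant method. The function name, patient IDs, peptide names, and allele assignments are all hypothetical, and the binding predictions are assumed to come from an external predictor.

```python
from typing import Dict, Set


def population_coverage(
    patient_alleles: Dict[str, Set[str]],
    peptide_alleles: Dict[str, Set[str]],
) -> float:
    """Fraction of patients carrying at least one HLA allele that some
    peptide in the candidate set is predicted to bind."""
    if not patient_alleles:
        raise ValueError("patient_alleles must not be empty")
    # Union of every allele targeted by any peptide in the set.
    targeted = set().union(*peptide_alleles.values())
    # A patient counts as covered if they share any allele with that union.
    covered = sum(1 for alleles in patient_alleles.values() if alleles & targeted)
    return covered / len(patient_alleles)


# Hypothetical cohort: patient ID -> HLA class I alleles (illustrative only).
patients = {
    "P1": {"A*02:01", "B*07:02"},
    "P2": {"A*24:02"},
    "P3": {"A*01:01", "B*08:01"},
    "P4": {"A*03:01"},
    "P5": {"A*02:01", "A*11:01"},
}

# Hypothetical predictions: peptide -> alleles it is predicted to bind.
peptides = {
    "pepA": {"A*02:01"},
    "pepB": {"A*24:02", "A*03:01"},
}

coverage = population_coverage(patients, peptides)
print(f"Predicted coverage: {coverage:.0%}")  # prints "Predicted coverage: 80%"
```

In this toy cohort, four of five patients carry at least one targeted allele. Adding a peptide predicted to bind A*01:01 or B*08:01 would cover the remaining patient, which is why peptide sets spanning several alleles are attractive for a population-level vaccine.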
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.