Original research
MER4 endogenous retrovirus correlated with better efficacy of anti-PD1/PD-L1 therapy in non-small cell lung cancer
  1. Julie Lecuelle1,2,3,4,
  2. Laure Favier5,
  3. Cléa Fraisse5,
  4. Aurélie Lagrange5,
  5. Coureche Kaderbhai5,
  6. Romain Boidot6,
  7. Sandy Chevrier6,
  8. Philippe Joubert7,
  9. Bertrand Routy8,9,
  10. Caroline Truntzer1,2,3,4 and
  11. Francois Ghiringhelli1,2,3,4,5
  1. 1Platform of Transfer in Biological Oncology, Georges-Francois Leclerc Cancer Center - UNICANCER, Dijon, Bourgogne-Franche-Comté, France
  2. 2UMR INSERM 1231, Dijon, Bourgogne-Franche-Comté, France
  3. 3Genomic and Immunotherapy Medical Institute, Dijon University Hospital, Dijon, Bourgogne-Franche-Comté, France
  4. 4University of Burgundy-Franche Comté, Dijon, Bourgogne-Franche-Comté, France
  5. 5Department of Medical Oncology, Georges François Leclerc Cancer Center - UNICANCER, Dijon, Bourgogne-Franche-Comté, France
  6. 6Department of Biopathology, Georges François Leclerc Cancer Center - UNICANCER, Dijon, Bourgogne-Franche-Comté, France
  7. 7Department of Pathology, Quebec Heart and Lung Institute Research Center, Quebec City, Quebec, Canada
  8. 8Department of Medicine Montréal, Division of Oncology, Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montreal, Quebec, Canada
  9. 9Division of Hematology-Oncology, Centre Hospitalier de l'Université de Montréal (CHUM), Quebec City, Quebec, Canada
  1. Correspondence to Dr Francois Ghiringhelli; fghiringhelli{at}


Background Endogenous retroviruses (ERVs) are highly expressed in various cancer types and are associated with increased innate immune response and better efficacy of antiprogrammed death-1/ligand-1 (anti-PD1/PD-L1)-directed immune checkpoint inhibitors (ICI) in preclinical models. However, their role in human non-small cell lung cancer (NSCLC) remains unknown.

Methods We conducted a retrospective study of patients receiving ICI for advanced NSCLC in two independent cohorts. ERV expression was determined by RNA sequencing. The primary endpoint was progression-free survival (PFS) under ICI. The secondary endpoint was overall survival (OS) from ICI initiation. We studied expression of 6205 ERVs. Multivariate Cox regression model with lasso penalty was estimated on the training set to select ERVs significantly associated with survival. The predictive power of these ERVs was compared with that of previously described transcriptomic signatures.

Results We studied two independent cohorts of 89 and 70 patients, used as training and validation sets. Clinicopathological characteristics included 75% of patients with non-squamous NSCLC. We selected four ERVs significantly associated with PFS. Only high MER4 ERV was associated with better PFS and OS in both cohorts. From a biological point of view, high MER4 expression is associated with higher infiltration of eosinophils and inflammatory gene signatures, while low MER4 expression is associated with enrichment in metabolism and proliferation signatures. Adding MER4 to previously described transcriptomic signatures of response to ICI improved their predictive power.

Conclusions MER4 ERV expression is useful to stratify risk and predict PFS and OS in patients treated with ICI for NSCLC. It also improves the predictive power of other known transcriptomic signatures.

  • immunotherapy
  • biomarkers, tumor
  • biostatistics

Data availability statement

Data are available on reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See



Introduction

Lung cancer is the most common form of cancer globally.1 The majority of lung cancers are classified as non-small cell lung cancer (NSCLC).2 In patients without targetable mutations, the main treatments rely on cytotoxic chemotherapies and checkpoint inhibitors, used either concomitantly or sequentially.3 In this context of tumors without oncogenic addiction, immunotherapy, and especially immune checkpoint inhibitors (ICI) targeting programmed death-1 or its ligand (PD-1/PD-L1), has changed the field of thoracic oncology. Initially, patients with advanced NSCLC were treated with chemotherapy followed by anti-PD-1 monoclonal antibodies (mAb). Currently, ICI are becoming standard in the first-line treatment of advanced NSCLC as monotherapy for patients with PD-L1 expression above 50%, and in association with chemotherapy for all-comers with NSCLC.4–6 Despite good efficacy, a majority of patients experience primary or secondary resistance, and the availability of efficient predictive biomarkers of response to ICI remains an unmet clinical need. Although a large number of biological studies have tested complex biomarkers, the only approved biomarker remains PD-L1 immunohistochemistry. In addition to PD-L1, various genomic and transcriptomic biomarkers have been developed, such as tumor mutational burden or various transcriptomic signatures.7–9

About 9% of the human genome contains endogenous retrovirus (ERV) sequences.10 Usually, their expression is silenced in most somatic tissues due to epigenetic control. In contrast, reports have suggested that some ERVs are transcribed in various cancer types.11 Expression of ERVs is classically associated with induction of inflammatory responses, and epigenetic modifier drugs were shown to trigger re-expression of ERVs with better efficacy of ICIs in preclinical models.11 12 In renal carcinoma, an association between expression of some ERVs and both tumor immune signatures and ICI efficacy was recently described.13 Taken together, these data raise the hypothesis that abnormal expression of immunogenic ERVs may elicit an antitumor immune response that could render the tumor more likely to benefit from ICI blockade. However, specific data are lacking for NSCLC.

In this study, we aimed to determine the predictive role of ERV mRNA expression in NSCLC treated with ICI. We also evaluated the ability of ERV analysis to improve the predictive capacity of classical transcriptomics immune signatures.

Materials and methods

Study population

We had access to two cohorts of patients with NSCLC who received anti-PD-1/PD-L1 checkpoint inhibitors between 2014 and 2020. Cohort 1 comprised 89 patients treated at the Georges François Leclerc Cancer Center, the University Hospital of Dijon, or the Hospital of Montréal, and cohort 2 comprised 70 patients treated at the Georges François Leclerc Cancer Center. For all patients, transcript abundance from RNA-seq data was available; the two cohorts were sequenced on different platforms using different technologies for mRNA isolation (ribosome depletion for cohort 1 and polyA purification for cohort 2).

Only patients from whom informed consent was obtained and recorded in the medical chart were included in this retrospective study.

For RNAseq analysis, this study falls within the scope of the biological collection authorization registered under the number AC-2014–2260.

PD-L1 expression analysis

PD-L1 protein expression in tumor cells was assessed by immunohistochemistry using a ready-to-use commercial PD-L1 kit with 22C3 antibodies (22C3 pharmDX; Agilent, Santa Clara, California, USA), PD-L1 concentrate antibody clone QR1 (Diagomics), or clone SP142 (Ventana). PD-L1 positivity was defined as staining in >1% of tumor cells. In this study, PD-L1 was considered only as positive or negative; we chose not to treat this variable as continuous.

LIPI score calculation

Blood cell counts and lactate dehydrogenase (LDH) levels at baseline before ICI treatment (within 30 days prior to the first treatment) were obtained from the electronic medical records. Demographic, clinical, pathological, and molecular data were also collected.

The LIPI (Lung Immune Prognostic Index) was developed on the basis of the derived neutrophil/lymphocyte ratio (dNLR = neutrophils/(leukocytes – neutrophils), with dNLR>3 considered high)14 and LDH>230 U/L (considered as the upper limit of normal in our center). We considered two distinct groups: negative if neither of these two conditions was met, and positive if one or both conditions were met.
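The grouping rule above can be illustrated with a minimal Python sketch. It is not the authors' code (their analyses were run in R). It assumes the dNLR definition of the cited LIPI reference, neutrophils divided by leukocytes minus neutrophils, and the function and parameter names are hypothetical.

```python
def derived_nlr(neutrophils: float, leukocytes: float) -> float:
    """Derived neutrophil/lymphocyte ratio; both counts in the same unit (e.g. 10^9/L)."""
    if leukocytes <= neutrophils:
        raise ValueError("leukocyte count must exceed neutrophil count")
    return neutrophils / (leukocytes - neutrophils)


def lipi_group(neutrophils: float, leukocytes: float, ldh_u_per_l: float,
               dnlr_cutoff: float = 3.0, ldh_upper_limit: float = 230.0) -> str:
    """Return 'negative' if neither criterion is met, 'positive' if one or both are met."""
    high_dnlr = derived_nlr(neutrophils, leukocytes) > dnlr_cutoff
    high_ldh = ldh_u_per_l > ldh_upper_limit  # 230 U/L is the center-specific upper limit
    return "positive" if (high_dnlr or high_ldh) else "negative"
```

The LDH threshold is passed as a parameter because the upper limit of normal depends on the laboratory.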

RNAseq data

For cohort 1, total RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tumor slices (5 × 5 µm) using the Maxwell 16 LEV RNA FFPE Purification kit (Promega) following the manufacturer’s instructions. Libraries were prepared from 12 µL of total RNA with the TruSeq Stranded Total RNA using Ribo-Zero (Illumina) following the manufacturer’s instructions. Once qualified, paired-end libraries were sequenced using 2 × 75 bp output on a NextSeq 500 device (Illumina).

For cohort 2, 10 ng of total RNA or 40 ng from FFPE were fragmented and 3’-end of mRNA were captured using an RT primer poly(T) containing a unique molecular identifier. Illumina adapters were added during this step by template switching. Fragments were then amplified with two PCR to complete Illumina adapters and add indexing sequences. Libraries were sequenced on NovaSeq 6000 platform.

Abundance of transcripts was quantified using the Kallisto program.15 This program is based on pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. The Kallisto transcript index used as reference was built from merged human cDNA and ncRNA files from the GRCh37 ENSEMBL assembly. Gene-level count matrices were then created with the DESeq2 library. Low-count genes were prefiltered by removing genes with too few reads.16

Genes differentially expressed according to MER4 expression were selected using the DESeq2 R package;16 cohorts were pooled and batch effect was accounted for by adding cohort as a covariate in the regression model. Gene set enrichment analysis (GSEA)17 was performed on resulting differential genes using Hallmarks of cancer gene sets from the Broad Institute and the R package ‘clusterProfiler’.18

Detection of ERVs

ERV sequences were detected using Telescope software.19 Only ERV sequences detected with more than five reads and in more than two patients were considered. With this filtering, 6205 ERVs common to both cohorts were retained for analysis.
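As an illustration of this filter, the following Python sketch keeps an ERV when it has more than five reads in more than two patients, then intersects the two cohorts. This reading of the threshold is an assumption, and the data layout (a dictionary mapping each ERV locus to its per-patient read counts) and the function name are hypothetical; the sketch does not reproduce the Telescope pipeline itself.

```python
def detected_ervs(counts: dict[str, list[int]],
                  min_reads: int = 5, min_patients: int = 2) -> set[str]:
    """Keep ERVs with more than `min_reads` reads in more than `min_patients` patients."""
    return {
        erv for erv, reads in counts.items()
        if sum(r > min_reads for r in reads) > min_patients
    }


# Toy read counts per ERV locus (one value per patient); the values are invented.
cohort1 = {"ERV_A": [6, 7, 8, 0], "ERV_B": [6, 7, 0, 0], "ERV_C": [10, 10, 10, 10]}
cohort2 = {"ERV_A": [9, 9, 9], "ERV_C": [5, 5, 5]}

# Only ERVs passing the filter in both cohorts are retained for analysis.
common = detected_ervs(cohort1) & detected_ervs(cohort2)
```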

Immune cell signatures

The abundance of 10 tissue-infiltrating immune and stromal cell populations (CD3+ T cells, CD8+ T cells, cytotoxic lymphocytes, NK (Natural Killer) cells, B lymphocytes, monocytic lineage, myeloid dendritic cells (DCs), neutrophils, endothelial cells, and fibroblasts) was estimated using transcriptomic profiles and the microenvironment cell population-counter (MCP-counter) method.20 This analysis was performed on variance-stabilized RNAseq data. Batch effect due to different platforms was removed using the removeBatchEffect() function from the limma R package.

Twenty-eight immune subpopulations were evaluated using single-sample gene set enrichment analysis (ssGSEA)21 based on gene signatures following the methodology proposed by Charoentong et al,22 including major types related to adaptive immunity: activated T cells, central memory (Tcm), effector memory (Tem) CD4+ and CD8+T cells, gamma delta T (Tgd) cells, T helper 1 (Th1) cells, Th2 cells, Th17 cells, regulatory T cells (Treg), follicular helper T cells (Tfh), activated, immature, and memory B cells, as well as cell types related to innate immunity, such as macrophages, monocytes, mast cells, eosinophils, neutrophils, activated, plasmacytoid, and immature DCs. ssGSEA analysis was performed on log2(TPM+1) as advised. To take into account the platform effect, z-scores were computed on log2(TPM+1) and batch effect was removed on this transformed data using the ComBat function from sva R package.

Transcriptomic signatures

Seven transcriptomic signatures were computed as a metagene by taking the mean expression of corresponding genes. Expression was based on variance-stabilized RNAseq data and batch effect due to different platforms was removed as described above. Three signatures are related to IFN pathways or checkpoint inhibitors: IFNγ, extended immune gene signature (EIG), and T cell-inflamed gene expression profile (GEP), and four are related to T cell immune infiltrate: cytotoxicity (CYTOX), TH1 orientation (TH1), cytotoxic lymphocytes (CTL), and CD274 gene expression (online supplemental table 1).23 24
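The metagene computation described above reduces to an arithmetic mean over the genes of a signature. A minimal Python sketch follows; the gene identifiers are placeholders (the actual gene lists are in online supplemental table 1) and the function name is hypothetical.

```python
from statistics import mean


def metagene_score(expression: dict[str, float], signature_genes: list[str]) -> float:
    """Mean expression of the signature genes for one sample.

    `expression` maps gene name to batch-corrected, variance-stabilized expression.
    A KeyError is raised if a signature gene is missing from the sample.
    """
    return mean(expression[gene] for gene in signature_genes)


# Placeholder gene names and invented values, for illustration only.
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 9.0}
score = metagene_score(sample, ["GENE_A", "GENE_B"])
```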

Statistical analysis

Patient characteristics were compared by group of origin (cohort 1 or cohort 2) using the χ² or Fisher’s exact test for qualitative variables, and the Wilcoxon test for continuous variables, as appropriate. P values were adjusted using the Benjamini-Hochberg false discovery rate (FDR) correction, and adjusted p values <0.05 were considered significant.
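For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be sketched in a few lines of Python. This is a generic implementation of the standard step-up method, not the authors' R code, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p values, in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A test is then declared significant when its adjusted p value falls below the chosen threshold (0.05 in this study).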

Progression-free survival (PFS) was calculated from the date of first immunotherapy administration until disease progression or death from any cause, and was evaluated at 6 months. Patients who were alive with no progression at 6 months were censored. Overall survival (OS) was calculated from the date of first immunotherapy administration until death from any cause and was censored at 1 year.
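The administrative censoring described above can be expressed as a small Python helper. This is a sketch with hypothetical names; it assumes that an event occurring exactly at the horizon still counts as an event.

```python
def censor_at(time_months: float, event: bool, horizon_months: float) -> tuple[float, bool]:
    """Truncate follow-up at `horizon_months`.

    Patients still event-free at the horizon are censored at that time;
    earlier events and earlier censoring times are left unchanged.
    """
    if time_months > horizon_months:
        return horizon_months, False
    return time_months, event


# PFS evaluated at 6 months, OS censored at 12 months (toy values).
pfs_time, pfs_event = censor_at(8.2, True, 6.0)
os_time, os_event = censor_at(9.5, True, 12.0)
```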

Survival analysis was performed using the survival R library. The prognostic value of the different variables was tested using univariate or adjusted multivariate Cox regression with lasso penalty for PFS and OS. Survival probabilities were estimated using the Kaplan-Meier method and survival curves were compared using the log-rank test. P values less than or equal to 0.05 were considered statistically significant. Nested models were compared using the likelihood ratio test (LRT) and the area under the curve (AUC).
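To make the Kaplan-Meier estimate concrete, the following Python sketch computes the product-limit survival curve and the median survival time from follow-up times and event indicators. It is a textbook implementation, not the survival R library used in this study; the function names are hypothetical.

```python
def kaplan_meier(times: list[float], events: list[bool]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all observations sharing the same time.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def median_survival(curve: list[tuple[float, float]]):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

A return value of `None` corresponds to a median that is "not reached" within the observed follow-up.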

Statistical analyses were performed using R software, and graphs were drawn using GraphPad Prism V.9.0.2.


Results

Patient characteristics

In total, 159 patients treated with ICI (in first or further lines) for metastatic NSCLC were retained for analysis in two independent cohorts sequenced on two different platforms. Median age was 66 years (IQR=14), and 75% of patients had non-squamous NSCLC. One hundred and forty-one patients (89%) received anti-PD-1 therapies (nivolumab or pembrolizumab) while the others received atezolizumab or durvalumab. PD-L1 status was available for 73 patients (46%), and PD-L1 expression was positive in 57 of these patients (78%). Clinical characteristics did not differ between the two cohorts. The main characteristics of the population are reported in online supplemental table 2.

Association between ERVs and ICI efficacy

Median number of ERVs was 764 (IQR=76) for cohort 1 and 115 (IQR=130) for cohort 2. Median ERV overall expression (total number of reads detected for each patient) was 4634 (IQR=484) in cohort 1 and 940 (IQR=780) in cohort 2. Neither total number of ERVs detected per patient nor overall ERV expression was associated with progression-free survival (PFS), or with OS (online supplemental table 3).

To investigate further, we searched for ERVs whose expression was associated with PFS using cohort 1 as a training set and cohort 2 as a validation set. A first selection was performed using univariate Cox models in cohort 1 on the 6205 ERVs common to both cohorts; 464 ERVs were found to be associated with PFS. After selection by multivariate Cox regression with lasso penalty, four ERVs (MER4 6p22-3c, LTR19 9q34-11, HERVFRD 3p21-31, and ERVLE 4q33-a) were retained, each associated with PFS in cohort 1 by univariate analysis (table 1). Using cohort 2 as a validation set, we found that only high expression of MER4 6p22-3c ERV remained associated with better PFS by univariate Cox regression (table 1). Using the third quartile of this ERV’s expression in cohort 1 as a cut-off, we observed that high MER4 6p22-3c ERV expression was associated with better PFS in both cohorts (HR=0.3 (0.2 to 0.7), p=0.006 in cohort 1 and HR=0.4 (0.2 to 0.9), p=0.02 in cohort 2) (figure 1A,B). Moreover, high MER4 6p22-3c ERV expression was associated with better OS in cohort 2 (HR=0.3 (0.1 to 1), p=0.04), with a non-significant trend in the same direction in cohort 1 (HR=0.6 (0.2 to 1.3), p=0.16) (figure 1C,D).

Figure 1

Association between survival and MER4 6p22-3c ERV expression. Kaplan-Meier curves with patients stratified according to MER4 6p22-3c ERV expression (low vs high) for progression-free survival (A) in cohort 1, (B) in cohort 2 and for overall survival (C) in cohort 1 and (D) in cohort 2. ERV, endogenous retrovirus.

Table 1

Univariate Cox model for PFS and OS in cohorts 1 and 2 for ERVs selected by multivariate Cox regression model with lasso penalty

Next, we evaluated whether MER4 6p22-3c was a prognostic marker independent of clinical variables. In the following analyses, MER4 expression was dichotomized based on third quartile expression in cohort 1 and patients were considered as MER4Low or MER4High, according to whether they were, respectively, below or above the third quartile. As both cohorts had comparable clinical characteristics, they were pooled for this analysis (table 2). We observed that high MER4 6p22-3c ERV expression was associated with better PFS (as continuous variable: HR=0.8 (0.7 to 0.9), p<1×10−3, as binary variable: HR=0.3 (0.2 to 0.6), p<1×10−3) and OS (as continuous variable: HR=0.8 (0.7 to 0.9), p<1×10−3, as binary variable: HR=0.4 (0.2 to 0.8), p=0.01). Similarly, MER4 6p22-3c ERV expression taken as a continuous variable was associated with disease control rate (p=0.01). Poor performance status and use of ICI after the first line were not associated with poorer PFS, while only poor performance status was associated with poorer OS (HR=2.9 (1.3 to 6.2), p=0.007). Only the presence of bone metastasis (HR=2.1 (1.2 to 3.5), p=0.006), a high dNLR score (HR=1.2 (1 to 1.5), p=0.01) and a LIPI score greater than one (HR=2.2 (1.2 to 1.5), p=0.008) were associated with poorer PFS. These variables were also associated with poorer OS (table 2). In a multivariate Cox model including LIPI score, bone metastasis, and MER4 expression, all three variables remained significantly associated with PFS, with HR=2.5 (1.4 to 4.4) (p=0.001) for presence of bone metastasis, HR=2.3 (1.3 to 4.1) (p=0.005) for LIPI score and HR=0.35 (0.13 to 0.99) (p=0.05) for MER4High expression.
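The dichotomization rule used in these analyses (a cut-off learned in cohort 1 and then applied unchanged to every patient) can be sketched in Python as follows. The expression values are invented, the function names are hypothetical, and the choice of the "inclusive" quantile method (which matches the default of R's quantile function) is an assumption about how the quartile was computed. Values equal to the cut-off are labeled MER4Low here, which is also an assumption.

```python
from statistics import quantiles


def third_quartile(values: list[float]) -> float:
    """Third quartile with linear interpolation over the observed range."""
    return quantiles(values, n=4, method="inclusive")[2]


def mer4_status(expression: float, cutoff: float) -> str:
    """Label a patient relative to the training-cohort cut-off."""
    return "MER4High" if expression > cutoff else "MER4Low"


# Toy MER4 expression values; the cut-off comes from the training cohort only.
cohort1_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
cutoff = third_quartile(cohort1_expression)

# The same cut-off is then applied to validation-cohort patients.
cohort2_status = [mer4_status(x, cutoff) for x in [0.5, 4.5, 4.0]]
```

Fixing the cut-off on the training cohort avoids tuning the threshold on the validation data.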

Table 2

Univariate and multivariate Cox models for PFS and OS including clinical variables in pooled cohorts 1 and 2

Clinical characteristics were compared between MER4High and MER4Low patients. No significant difference was observed (online supplemental table 4).

We next evaluated the incremental predictive capacity of MER4 expression relative to PD-L1 status, as this latter marker is classically used as a biomarker for anti-PD-1/PD-L1 mAb in clinical practice. As PD-L1 status based on immunohistochemistry staining was missing for a large proportion of patients, PD-L1 status was inferred from CD274 gene expression based on RNA sequencing.23 Using the first quartile of CD274 expression as a cut-off, we observed that high expression of this gene was associated with better PFS (HR=0.5 (0.3 to 0.8), p=0.002) (figure 2A) and better OS (HR=0.5 (0.3 to 0.9), p=0.02) (online supplemental figure 1A). Although weak, the correlation between CD274 and MER4 expression was significant (Pearson coefficient=0.2; p=0.01). A multivariate model including CD274 mRNA expression level and MER4 6p22-3c expression (dichotomized according to the third quartile expression in cohort 1) was also estimated. MER4 status had an incremental predictive value compared with PD-L1 information alone in terms of PFS (AUC=0.7 vs 0.6, LRT p<1×10−3) and OS (AUC=0.64 vs 0.57, LRT p=0.002) (figure 2B and online supplemental figure 1B). Using CD274 and MER4 expression, we were able to separate patients into four groups on the basis of their relative expression levels: CD274Low/MER4Low (n=29), CD274Low/MER4High (n=8), CD274High/MER4Low (n=77) and CD274High/MER4High (n=25). Patients who were classified CD274High/MER4High had the best PFS: median survival was not reached vs 1.6 (1.4 to 2.6) months in the CD274Low/MER4Low group (HR=0.1 (0.1 to 0.3), p<1×10−3); 3.8 (2.7 to not reached) months for the CD274Low/MER4High group (HR=0.4 (0.1 to 1.2), p=0.11); and 2.8 (2.3 to 4.9) months for the CD274High/MER4Low group (HR=0.3 (0.1 to 0.6), p=0.001) (figure 2C). Similar results were observed for OS (online supplemental figure 1C).

Figure 2

Association between survival and CD274 expression. (A) Kaplan-Meier curves with patients stratified according to CD274 gene expression (low vs high) for progression-free survival. (B) Barplots of time-dependent AUC for CD274, MER4 and combined (CD274 and MER4) models for progression-free survival. P<0.01 is represented by two stars and p<1×10−3 by three stars. (C) Kaplan-Meier curves with patients stratified according to CD274 gene and MER4 ERV expression for progression-free survival. AUC, area under the curve.

MER4 6p22-3c ERV expression is thus an independent predictive and prognostic marker in patients with NSCLC treated by ICI monotherapy and is associated with better outcome.

Characterization of transcriptomic immune parameters related to MER4 expression

To better explain the link between MER4 expression and response to ICI, we analyzed the relation between MER4 and transcriptomic features of tumors. Using differential gene expression analysis (figure 3A) for MER4 expression, we observed that 311 genes were significantly upregulated in the MER4High group, and 330 genes were significantly downregulated, using a Benjamini-Hochberg adjusted p<0.05 and absolute log fold-change >1 as filters.
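The two filters applied to the differential expression results can be written as a simple classification rule. The Python sketch below assumes that the fold change is expressed on the log2 scale, with positive values meaning higher expression in the MER4High group; the function name and the toy values are hypothetical.

```python
def classify_gene(log2_fold_change: float, adjusted_p: float,
                  lfc_cutoff: float = 1.0, alpha: float = 0.05) -> str:
    """Return 'up', 'down', or 'ns' (not significant) for one gene."""
    if adjusted_p < alpha and abs(log2_fold_change) > lfc_cutoff:
        return "up" if log2_fold_change > 0 else "down"
    return "ns"


# (log2 fold change, Benjamini-Hochberg adjusted p value) pairs; invented values.
results = {"GENE_A": (1.8, 0.001), "GENE_B": (-2.3, 0.02), "GENE_C": (0.4, 0.001)}
calls = {gene: classify_gene(lfc, padj) for gene, (lfc, padj) in results.items()}
```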

Figure 3

Transcriptomic description related to MER4 6p22-3c ERV expression. (A) Volcano plot displaying differentially expressed genes, given MER4 ERV expression. The vertical axis (y-axis) corresponds to the log 10 of the adjusted p value using Benjamini-Hochberg FDR correction, and the horizontal axis (x-axis) displays the log 2 fold change value. Red dots on the right (or left, respectively) are genes significantly upregulated (or downregulated) in patients with high MER4 ERV expression. (B) Barplots of all signaling pathways found using GSEA, enriched in high MER4 ERV expression on the right, and enriched in low MER4 ERV on the left. P values were adjusted using Benjamini-Hochberg FDR correction; adjusted p values <0.05 were considered significant and are represented by an orange bar. (C,D) Boxplots showing immune infiltration related to MER4 ERV expression evaluated with (C) MCP counter and (D) Charoentong methodology. P values were adjusted using Benjamini-Hochberg FDR correction; adjusted p<0.1 is represented by one star and adjusted p<0.05 by two stars. ERV, endogenous retrovirus; FC, fold change; FDR, false discovery rate; GSEA, gene set enrichment analysis; MCP, microenvironment cell populations.

GSEA using Hallmarks of cancer gene sets showed enrichment of metabolic (glycolysis), Myc-related, E2F-related, and G2M checkpoint-related pathways in the MER4Low group, and of immune and inflammatory pathways (TNF, UV and inflammatory pathways) in the MER4High group (figure 3B).

To explore further, we compared immune infiltration between MER4High and MER4Low patients. Based on MCP-counter analysis, no significant difference was observed (figure 3C). Using immunophenoscore signatures as reported by Charoentong et al,22 we observed accumulation of mast cells (adjusted p=0.04), eosinophils (adjusted p=0.04), macrophages (adjusted p=0.04), regulatory T cells (adjusted p=0.04), and T follicular helper cells (adjusted p=0.04) in MER4High patients (figure 3D), thus suggesting better immune reactivity in such tumors.

Together, these results indicate that MER4 expression was associated with an inflammatory phenotype and accumulation of innate and adaptive immune cells.

MER4 adds predictive power to transcriptomic signatures

Previous studies have underlined that transcriptomic signatures related to IFN pathways or checkpoint inhibitors could be used to predict the efficacy of checkpoint inhibitors in various cancer types.7 9 Accordingly, we tested in our series the predictive role of IFN, extended immune gene signature (EIG), and T cell-inflamed gene expression profile (GEP) (online supplemental table 1).24 Each signature, considered as a continuous variable, was significantly associated with better PFS by univariate Cox analysis, yielding AUCs between 0.62 and 0.65 (online supplemental table 5 and online supplemental figure 2A). We also used the signature of T cell accumulation in the tumor, which we previously showed to be associated with better response to ICI.23 Accordingly, we computed gene signatures, respectively, associated with Th1 response, cytotoxic response, and presence of CD8 T cells (online supplemental table 1). Each signature was significantly associated with better PFS by univariate Cox analysis, yielding AUCs between 0.62 and 0.67 (online supplemental table 5). Similar results were observed for OS (online supplemental table 5 and online supplemental figure 2B).

To test the capacity of the ERV model to improve prognostic prediction, we generated models that combined MER4 expression with each previously described signature. Comparison of the models using the LRT showed that MER4 improved the predictive power of all gene signatures (online supplemental table 5 and online supplemental figure 2A). Similar results were observed for OS (online supplemental table 5 and online supplemental figure 2B).
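The likelihood ratio test used to compare a signature-only model with a signature-plus-MER4 model can be illustrated as follows. This Python sketch assumes that the two (partial) log-likelihoods are already available from fitted Cox models and that the larger model adds a single parameter, so the statistic is compared with a chi-square distribution with one degree of freedom. The function name and the log-likelihood values are hypothetical.

```python
import math


def lrt_one_parameter(loglik_reduced: float, loglik_full: float) -> tuple[float, float]:
    """Likelihood ratio statistic and p value for nested models differing by one parameter.

    For one degree of freedom, the chi-square survival function equals
    erfc(sqrt(statistic / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot have a lower log-likelihood")
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


# Invented log-likelihoods: signature alone vs signature plus MER4.
statistic, p_value = lrt_one_parameter(-310.4, -305.1)
```

A small p value indicates that adding MER4 improves the fit beyond what the signature alone achieves.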

Using the best statistical model for PFS, which included CTL signature and MER4 expression, we separated patients into four groups using the median as the cut-off for the CTL signature. High expression of MER4 6p22-3c was associated with improved PFS. Patients classified as CTLLow/MER4High had a significantly better PFS than CTLLow/MER4Low patients (median PFS not reached (3.2 to not reached) vs 1.8 (1.6 to 2.6) months, HR=0.2 (0.1 to 0.5), p<1×10−3), while CTLHigh/MER4High patients tended to have a better PFS than CTLHigh/MER4Low patients (median PFS not reached (4.3 to not reached) vs 4.9 (2.7 to not reached) months, HR=0.5 (0.2 to 1.1), p=0.09) (online supplemental figure 2C). Similar results were observed for OS. Patients classified CTLLow/MER4High had better OS than CTLLow/MER4Low patients (median OS not reached vs 5.9 (3.9 to 10.2) months, HR=0.1 (0.03 to 0.6), p=0.007) (online supplemental figure 2D).


Discussion

This study provides novel insights into the role of MER4 ERV in predicting response to anti-PD-1/PD-L1 therapy in NSCLC.

Classically, it is thought that the accumulation of all ERVs could trigger inflammatory response. This article underlines that the type of ERVs might be more important than the global number of ERVs. Such data suggest the hypothesis that viral mimicry triggered by specific ERVs could promote immune response to ICIs.

Human ERVs derive from ancestral exogenous retroviruses, whose genetic material is integrated in human germline DNA. ERV sequences in humans account for about 9% of the genome.25 Human ERVs are grouped into three classes, from I to III, based on similarity with the exogenous Gammaretrovirus, Betaretrovirus, and Spumavirus, respectively. Until recently, transcriptomic characterization of ERVs using RNA-seq was complicated by uncertainty in fragment assignment. To address this issue, we used the recent software Telescope.19 This software provides accurate estimation of ERV expression resolved to specific genomic locations.

ERVs were previously shown to affect cancers by various direct effects. ERVs are frequently more expressed in cancer cells than in normal cells. Such re-expression is frequently related to DNA hypomethylation, which is often found in cancers and promotes ERV transcription.26–28 This reactivation can influence tumor genome stability.28–31 Some ERVs can induce oncogenesis via their capacity to promote oncogene expression.32 For example, ERV-derived genes can activate the MAPK pathway, a classical oncogenic pathway.33 Some ERVs related to Syncytin mediation of cell fusion can also promote cell fusion in cancer cells, a process known to be related to cancer progression, metastasis, and chemoresistance.34 35 In addition, ERVs are enriched in long noncoding RNA (lncRNA) exons.36 37 These ERV-derived lncRNAs represent around 10% of the genome of human ERVs.38 Such lncRNAs play an important role in regulating gene expression, and using this pathway, ERVs could inhibit some oncogenes or antioncogenes, thus regulating cancer biology. In our study, using GSEA analysis, we report that MER4 ERV expression is inversely associated with upregulation of Myc and E2F target genes, which are classically known to be related to the epithelial-mesenchymal transition process.39 40 Similarly, higher expression of glycolysis, mTORC1, DNA repair, and G2M checkpoint pathways in MER4Low tumors indicates more proliferative tumors. Our results raise the hypothesis that MER4 could negatively regulate the epithelial-mesenchymal transition process and tumor cell proliferation.

ERVs could also have a major role in immune response. First, ERVs could encode tumor-associated neoantigens.41 Second, their mRNA can affect both innate and adaptive immune responses via various mechanisms. The mRNA of ERVs could be detected by RIG-I-like receptors and TLR3.42–44 RIG-I receptors mediate activation of the adaptor molecule called mitochondrial antiviral signaling protein, thus leading to activation of the type I IFN signaling pathway via the activation of IFN regulatory factor 3 (IRF3), and induce nuclear factor κB (NF-κB) activation.45 TLR3 induces activation of type I IFN and CXCL9/CXCL10 production via the activation of IRF1. In addition, type I IFN promotes cancer cell immunogenicity by inducing the expression of class I MHC on tumor cells, thus enhancing the T cell adaptive immune response.46 In contrast, some reports underline that ERVs could harbor immunosuppressive functions. For example, ERVs could decrease production of IL2 and CXCL9 by immune cells. ERVs contain a sequence called the immunosuppressive domain, which could modulate the activation of immune cells.47 Syncytins are known to protect the fetus from the mother’s immune system, and ERVs can use this system to help cancer evade the host immune system.48–50 In our study, outcome was associated not with the overall number or expression of ERVs but only with MER4 expression. This suggests that ERVs may have both positive and negative effects on cancer growth and immune response. As we observed a stronger inflammatory immune response signature and high accumulation of both adaptive and innate immune cells in the MER4High group, these data suggest a positive immune effect of this particular ERV.

In metastatic renal cell carcinoma, ERVE-4 HERV expression was associated with increased disease control rate and longer PFS in nivolumab-treated patients but not in everolimus-treated patients.51 This finding extends the role of HERVs to other cancers treated with immunotherapy and supports a predictive rather than a prognostic role.

Combining epigenetic drugs with ICIs is an emerging field and shows promising results in clinical trials.52 53 Recent studies have identified chromatin regulators with cell-intrinsic effects on the immune sensitivity of cancer cells, raising the possibility that epigenetic therapies could enhance efficacy of ICIs. Double-stranded RNAs can accumulate in cancer cells on derepression of ERVs by epigenetic drugs such as DNA demethylating agents and lysine demethylase 1 inhibitors.54 It has also been shown that the H3K9 methyltransferase SETDB1 represses retroviral sequences and controls response to ICIs.55 Such data support the rationale that epigenetic drugs could influence immune response by targeting ERVs.

MER4 is a member of the ‘HEPSI’ supergroup, which is related to Class I ERVs (Epsilonretroviruses, Gammaretroviruses). However, MER4 classification remains complex because these proviruses are highly defective and often do not yield a Pol protein.56 Very few data are available on the biological role of this virus, and our report suggests that it might be important in tuning the antitumor immune response in NSCLC. Interestingly, MER4 is not correlated with other classical predictive factors, suggesting that it is a new independent predictive factor.

This study has some limitations, related to its retrospective design and the small number of patients. However, the analysis of two independent series with different technologies gives strength to our observations. Our study involved patients treated in the second or later lines with anti-PD-1/PD-L1. Currently, these drugs are used in the first line, alone or with chemotherapy, so further data on patients treated in this setting are required. Because we could not test patients not treated with immunotherapy, we could not determine the predictive versus prognostic role of this signature. Finally, additional mechanistic studies using genetic inactivation of MER4 in human NSCLC cell lines are required to better explain the mechanism of action of this ERV in NSCLC biology.

To conclude, our study provides novel insights into the role of MER4 as a predictive marker of response to checkpoint inhibitors in NSCLC. It provides evidence that the addition of MER4 to immune transcriptomic signatures could be used to improve prediction.

Data availability statement

Data are available on reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

The study was conducted according to the guidelines of the Declaration of Helsinki and European legislation and approved by the CNIL (French national commission for data privacy) and the Georges François Leclerc Cancer Center (Dijon, France) local ethics committee (13.085). Participants gave informed consent to participate in the study before taking part.


The authors would like to thank Integragen (Genopole Campus 1, Evry) for the generation of sequencing data, and Laurence Zitvogel (Institut Gustave Roussy, Paris) for her contribution to the analysis of these data. The authors would also like to thank Dr Fiona Ecarnot (EA3920, University of Franche-Comté, Besançon) for English correction and helpful comments.



  • Twitter @boidot_romain

  • CT and FG contributed equally.

  • Contributors Conceptualization: CT, FG. Methodology: JL, CT. Validation: CT, FG. Formal analysis: JL. Data acquisition: JL, CT, LF, CF, AL, CK, RB, SC, PJ, BR. Writing-original draft preparation: JL, CT, FG. Visualization: JL, CT, FG. Supervision: CT, FG. All authors have read and agreed to the published version of the manuscript. Guarantor: FG.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.