
Current challenges for assessing the long-term clinical benefit of cancer immunotherapy: a multi-stakeholder perspective
  1. Casey Quinn1,
  2. Louis P Garrison2,
  3. Anja K Pownell1,
  4. Michael B Atkins3,
  5. Gérard de Pouvourville4,
  6. Kevin Harrington5,
  7. Paolo Antonio Ascierto6,
  8. Phil McEwan7,
  9. Samuel Wagner8,
  10. John Borrill8 and
  11. Elise Wu8
  1. 1 PRMA Consulting, Fleet, UK
  2. 2 CHOICE Institute, University of Washington, Seattle, Washington, USA
  3. 3 Georgetown University Medical Center, Washington, DC, USA
  4. 4 ESSEC Business School, Cergy-Pontoise, France
  5. 5 Royal Marsden Hospital NHS Trust, London, UK
  6. 6 Istituto Nazionale Tumori IRCCS Fondazione Pascale, Napoli, Italy
  7. 7 Centre for Health Economics, Swansea University, Swansea, UK
  8. 8 Bristol-Myers Squibb, New York, New York, USA
  1. Correspondence to Dr Anja K Pownell; apownell{at}


Immuno-oncologics (IOs) differ from chemotherapies as they prime the patient’s immune system to attack the tumor, rather than directly destroying cancer cells. The IO mechanism of action leads to durable responses and prolonged survival in some patients. However, providing robust evidence of the long-term benefits of IOs at health technology assessment (HTA) submission presents several challenges for manufacturers. The aim of this article was to identify, analyze, categorize, and further explore the key challenges that regulators, HTA agencies, and payers commonly encounter when assessing the long-term benefits of IO therapies. Insights were obtained from an international, multi-stakeholder steering committee (SC) and expert panels comprising payers, economists, and clinicians. The selected individuals were tasked with developing a summary of challenges specific to IOs in demonstrating their long-term benefits at HTA submission. The SC and expert panels agreed that standard methods used to assess the long-term benefit of anticancer drugs may have limitations for IO therapies. Three key areas of challenges were identified: (1) lack of a disease model that fully captures the mechanism of action and subsequent patient responses; (2) estimation of longer-term outcomes, including a lack of agreement on ideal methods of survival analyses and extrapolation of survival curves; and (3) data limitations at the time of HTA submission, for which surrogate survival end points and real-world evidence could prove useful. A summary of the key challenges facing manufacturers when submitting evidence at HTA submission was developed, along with further recommendations for manufacturers on what evidence to produce. Despite almost a decade of use, there remain significant challenges around how best to demonstrate the long-term benefit of checkpoint inhibitor-based IOs to HTA agencies, clinicians, and payers. 
Manufacturers can potentially meet or mitigate these challenges with a focus on strengthening survival analysis methodology. Approaches to doing this include identifying reliable biomarkers, intermediate and surrogate end points, and the use of real-world data to inform and validate long-term survival projections. Wider education across all stakeholders—manufacturers, payers, and clinicians—in considering the long-term survival benefit with IOs is also important.

  • immunotherapy
  • healthcare economics and organizations
  • guidelines as topic
  • programmed cell death 1 receptor

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See



The development of immuno-oncology (IO) drugs has represented a significant breakthrough in the treatment of cancer. Currently, single-agent and combination immunotherapy regimens have been approved by the Food and Drug Administration (FDA) and European Medicines Agency for at least 20 indications.1

IOs differ from conventional chemotherapies as they target the patient's immune system rather than directly attacking the tumor. This immune activation can lead to durable responses and prolonged survival in some, but not all, treated patients. Long-term follow-up in patients with metastatic melanoma who received treatment with ipilimumab, for example, has shown 3-year survival rates ranging from 20% to 26%, with survival of up to 10 years in some patients,2 whereas only 1%–2% of patients with metastatic melanoma achieve a durable long-term response to chemotherapy.3 Durable responses have also been observed in multiple other tumor types. For patients treated with nivolumab, overall survival (OS) curves have been seen to flatten at around 3 years, with estimated 5-year survival rates of approximately 30% for renal cell carcinoma, and 16% for non-small cell lung cancer (NSCLC).4 Even higher long-term survival rates (ie, plateaus) appear to be likely in patients receiving IO-combination therapy for these tumor types.5–7

Assessing the value of a new oncology drug is a key goal of health technology assessment (HTA) and pricing and reimbursement negotiations. While HTA agencies worldwide differ in terms of their assessment criteria and methods, the gold standard efficacy criterion in oncology is OS. The outcome is clear and measurable, and the benefit of longer survival is irrefutable. However, providing robust OS evidence requires large trial sample sizes and often many years of trial follow-up, which can potentially deny or delay critically ill patients access to life-extending therapies. Manufacturers are therefore faced with the challenge of demonstrating the potential long-term benefit of their IO therapies with short-term survival data.


The aim of this article is to report insights from multi-stakeholder interviews and expert panels, as well as the scientific literature, on the challenges regulators, HTA agencies, and payers commonly encounter when assessing the long-term benefits of IOs across a range of tumor types.


A pragmatic review of published literature in PubMed was conducted in September 2017 with the aim of gaining a deeper understanding of the topic to be addressed. Articles on IO or cancer immunotherapy were identified which contained data on long-term survival outcomes. Since IO therapies and their long-term benefits are relevant to multiple therapeutic areas, the searches and subsequent discussions were not focused on a single tumor type. A total of 27 published papers were identified for the topics extrapolation, long-term survival, survival modeling/analysis, alternative metrics/end points, surrogates, non-proportional hazards, and cure modeling. In addition, 86 reimbursement submissions were considered for ipilimumab, nivolumab, and pembrolizumab from agencies in Australia, Canada, Germany, Sweden, and the UK.

Multi-stakeholder interviews to develop this document were conducted between July 2017 and February 2018 (figure 1). A steering committee (SC) was convened which comprised nine payers, economists, and clinicians from the US, UK, France, Italy, and Sweden. They were tasked with developing a summary of the challenges that were specific to demonstrating the long-term benefit of IOs. Once the summary had been developed, it was reviewed and refined by conducting double-blind multi-stakeholder expert panels from the US, UK, France, Germany, and Sweden.

Figure 1

Overview of research methodology. HTA, health technology assessment.

These five expert panels were asked to evaluate themes from the summary of challenges in terms of how they applied to each country’s assessment criteria. In line with the search criteria, participants were not asked to focus on a single tumor type. The panel sessions were run over 1 day, lasting around 6 hours each. The themes were either confirmed, amended, or replaced, and additions were also allowed. Recommendations to manufacturers for how best to meet and overcome the challenges in future submissions were reviewed. The SC then finalized the summary of challenges and a corresponding list of questions for manufacturers to consider at HTA submission.

The SC meetings and face-to-face expert panels were moderated by PRMA Consulting. Purposeful sampling approaches were used to identify potential participants for this research, based on the existing research networks and online searches. Panelists were recruited via email; six were recruited from each of the US, UK, and France, five from Germany, and four from Sweden.


The summary of challenges for evaluating and communicating the long-term benefits of IOs at HTA submission is presented in table 1. The challenges correspond with questions for manufacturers to consider and are grouped into key areas: how the mechanism of action of IO therapy may impact on longer-term survival; the estimation of the survival benefit; and limited data available at the time of HTA submission.

Table 1

Summary of key challenges in presenting the value of IOs in HTA submissions

Mechanism of action

The underlying mechanism of action of IOs and the resulting tumor response underpin the challenges in demonstrating the potential long-term survival benefit of these drugs. Response patterns can differ significantly from traditional anticancer drugs such as chemotherapy, making subsequent survival analyses more challenging, principally because of delayed separation of survival curves between therapies.8–12 The effect of IOs in patients who respond can roughly be divided into three stages (figure 2):

Figure 2

Typical Kaplan-Meier survival curves observed with IO therapies. IO, immuno-oncology.

  • Non-separation of the Kaplan-Meier (KM) survival curves for IO and standard chemotherapy during the initial treatment phase (ie, the first 3–6 months).

  • Separation of the KM curves as activation of the immune cells leads to a clinically measurable antitumor effect in patients receiving IO therapy, and those receiving chemotherapy develop resistance to treatment.

  • Plateauing of the tail of the IO KM curve many months after the first administration and continuing long after treatment has ceased.8 13–15

In clinical trials of IOs, both conventional and non-conventional response patterns have been observed.16–19 Conventional response patterns include complete response with an early reduction in tumor burden and stable disease. Non-conventional response patterns, unique to IOs, can also occur. These can include a delayed response; an early reduction in target tumor burden accompanied by new lesions; an initial increase in tumor burden followed by a decrease in tumor burden (so-called ‘pseudo-progression’); and accelerated tumor growth (hyperprogression), which may also be indicative of aggressive disease and not necessarily a response to IO therapy.16 Although only a small proportion of patients (around 10%) may experience pseudo-progression, this type of response may lead to premature discontinuation of an effective treatment or a delay in starting a new therapy.16 18 20

IOs have additionally proven capable of inducing durable treatment responses that can continue for years after treatment discontinuation for a proportion of patients—another non-traditional response pattern.2 11 This typically manifests as the ‘flattening’ or ‘plateauing’ shape of the IO KM survival curve and suggests that some patients may be potentially ‘cured’ (ie, their expected survival is comparable with that of the general population matched for age and sex). Nevertheless, durable responses to IOs cannot currently be predicted with accuracy at the initiation or early on in the treatment.


In oncology, we have entered an era of precision medicine which enables clinicians and researchers to predict with greater accuracy which treatments will be most effective for which groups of patients. Several biomarkers have been investigated to enable the prediction of response to immunotherapy and long-term survival. However, the dynamic nature of the immune system means that it changes during immune responses, making it difficult to identify a single biomarker to predict responses.21 22 Programmed death-ligand 1 (PD-L1) has been explored extensively and has shown some ability to differentiate patients,23 24 although IO products, even in studies targeting PD-L1, continue to be approved with broader indications, because efficacy data have also shown benefit in patients with tumors without PD-L1 expression.25–27

Work continues on the identification of biomarkers and genetic factors such as tumor mutational burden,28–31 gene expression profiling of the tumor microenvironment,31 microsatellite instability (MSI) in tumors,32 33 microbiome status,34 35 analysis of immune system and cancer interactions,36 and a more holistic ‘immune scoring’ approach.37–40 Pembrolizumab has been approved in the USA for the treatment of adult and pediatric patients with unresectable or metastatic MSI-High or mismatch repair-deficient solid tumors that have progressed after treatment, irrespective of their specific site of origin.41 Although biomarkers such as MSI can be predictive and prognostic, the percentage of patients with tumors characterized as MSI-High remains small. In clinical trials for metastatic colorectal cancer, for instance, only 3.5%–5% of patients had MSI-High tumors.42

Estimation of longer-term outcomes

Survival data from clinical trials must be extrapolated using validated and established statistical methods to estimate long-term survival benefit,43 as typically no clinical trial is long enough in duration to capture this. Several studies have explored the methods of survival analysis and extrapolation within the context of IOs and criticized their performance since they often underestimate the long-term survival benefit and thus the value of IO treatments.8 10 44 45 Innovative modeling techniques are needed to handle data immaturity and the response patterns of IOs. Estimating survival can be broken out into two general areas: within-trial analysis and extrapolation.

Within-trial analysis of survival

Median survival and hazard ratios are commonly used to assess survival in HTAs. However, for IO therapy these may fail to capture the magnitude of survival benefit with the available data, because they do not adequately capture the non-conventional response patterns of IOs, such as the delayed separation of survival curves (figure 2).46 47 The hazard ratio from a proportional hazards model assumes that the ratio of the two hazard functions is approximately constant over time; when this assumption is plausible, the ratio summarizes the relative difference between two survival curves. However, when the delayed separation of survival curves is present and/or a plateau is evident, as often observed with IO therapies, the proportional hazards assumption is often violated. The result is a potential loss of statistical power to demonstrate the difference between the treatment arms.11 Alternative metrics exist that suit the IO mechanism of action, including landmark survival analysis48 and restricted mean survival time.9 Both capture the survival benefit in the tail of the curve, and landmark survival analysis can also better capture the changing hazard function over time and provide data to support the plateauing effect.49–51
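As an illustration of the two alternative metrics named above, the following minimal Python sketch computes a Kaplan-Meier curve, a landmark survival probability, and the restricted mean survival time (RMST, the area under the survival curve up to a chosen horizon). The function names, the horizon, and the toy data are assumptions made for illustration only; they do not come from any trial or HTA submission discussed in this article.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    events: 1 = death observed, 0 = censored (hypothetical toy encoding).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def landmark_survival(curve, t):
    """Survival probability at a fixed landmark time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def rmst(curve, tau):
    """Restricted mean survival time: area under the KM curve on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for time, surv in curve:
        if time >= tau:
            break
        area += prev_s * (time - prev_t)
        prev_t, prev_s = time, surv
    return area + prev_s * (tau - prev_t)


# Toy data: follow-up in years, one censored subject at year 3.
curve = kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1])
three_year_survival = landmark_survival(curve, 3)
mean_survival_to_4y = rmst(curve, 4)
```

Neither quantity relies on the proportional hazards assumption, which is why both are often suggested when curves separate late or flatten; comparing arms would simply mean computing each metric per arm and taking the difference.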

Estimating long-term survival by extrapolating from clinical trial data

In the second stage of estimation, long-term, including lifetime, survival benefit is shown by extrapolating from available data. Extrapolation is of interest for HTA agencies that use economic evaluation to inform their decisions. In this circumstance, lifetime costs and health benefits (life years or quality-adjusted life years) must be estimated. The standard method of survival extrapolation is parametric regression modeling, which is more suited to traditional chemotherapeutic regimens.46 The mechanism of action for IOs exacerbates multiple methodological challenges: assessing non-proportional hazards, the plateau effect, loss of statistical power in the tail of the survival curves due to censoring, and unobserved heterogeneity in the patient population. More flexible and complex approaches, such as piecewise models, cubic splines, response-based models, cure fraction modeling, and mixture cure modeling, may be needed to capture the characteristic IO pattern of delayed treatment effects and, for a subset of patients, the plateau of long-term survival.

Parametric regression models

Parametric regression-based modeling is an approach that is commonly used in HTA43 to estimate the probability of dying, period-by-period, using all available data (ie, the time period and whether a subject is known to be alive, known to be dead, or has been censored). Other factors such as age, sex, and/or treatment regimen can be added as covariates for adjustment. The most commonly used are the exponential, Weibull, Gompertz, log-logistic, and log-normal distributions.52
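To make the idea of a single parametric model concrete, the sketch below fits the simplest of the listed distributions, the exponential, to censored data and extrapolates it, and also writes out the Weibull survival and hazard functions. The function names and toy data are illustrative assumptions; in practice such models would be fitted with a statistical package and compared on goodness of fit and clinical plausibility.

```python
import math


def fit_exponential(times, events):
    """Maximum likelihood estimate of a constant hazard with right censoring:
    number of deaths divided by total follow-up time."""
    return sum(events) / sum(times)


def exponential_survival(t, rate):
    """S(t) = exp(-rate * t); the hazard is the same at every time point."""
    return math.exp(-rate * t)


def weibull_survival(t, scale, shape):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """h(t) for t > 0: rising if shape > 1, falling if shape < 1, flat if shape == 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Toy data: follow-up in years, 1 = death, 0 = censored.
rate = fit_exponential([1, 2, 3, 4], [1, 1, 0, 1])
ten_year_survival = exponential_survival(10, rate)  # extrapolation beyond the data
```

The exponential hazard is constant and the Weibull hazard can only rise or fall monotonically, so neither can by itself reproduce an early period of no separation followed by a late plateau.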

The assumptions that underlie how single regression models predict the hazard function may not be appropriate for IOs, where hazards are more complex due to the mechanism of action and changing patient population over time. As patients with an immune response come to be the only survivors, the underlying pattern of survival changes.8 10 44 As such, more flexible approaches might capture this and better fit the tail of the survival curve when it flattens.

Expansion to piecewise models

Expansion to piecewise regression modeling has arisen as an attempt to address treatments like IOs in which a subgroup of patients within a cohort survives at much higher rates. This results in an initial period wherein all patients are represented, but in later periods only a certain type of patient is represented (ie, those with immune response). There have been multiple applications of these models in HTAs across a range of tumors.53–60 This approach is still a relatively conventional method and can also be used to deal with convergence or divergence of hazard functions between treatments in later periods.
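One simple version of this idea is a piecewise exponential model, in which the hazard is constant within each period but allowed to differ between periods. The sketch below is a hedged illustration: the cut point, function names, and toy data are assumptions, and real applications would choose cut points from clinical reasoning and the observed hazard pattern.

```python
import math


def piecewise_rates(times, events, cuts):
    """Constant hazard per interval: deaths in the interval divided by
    person-time spent in the interval.

    cuts are interior cut points, e.g. [6] gives intervals (0, 6] and (6, inf).
    """
    edges = [0.0] + list(cuts) + [math.inf]
    rates = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        exposure = sum(min(t, hi) - lo for t in times if t > lo)
        deaths = sum(e for t, e in zip(times, events) if lo < t <= hi)
        rates.append(deaths / exposure if exposure > 0 else 0.0)
    return rates


def piecewise_survival(t, rates, cuts):
    """S(t) = exp(-cumulative hazard), summing rate * time spent per interval."""
    edges = [0.0] + list(cuts) + [math.inf]
    cum_hazard = 0.0
    for rate, lo, hi in zip(rates, edges[:-1], edges[1:]):
        if t <= lo:
            break
        cum_hazard += rate * (min(t, hi) - lo)
    return math.exp(-cum_hazard)


# Toy data in months: early deaths, then long-term survivors (0 = censored).
times = [2, 4, 10, 30, 40]
events = [1, 1, 1, 0, 0]
rates = piecewise_rates(times, events, cuts=[6])
survival_at_12_months = piecewise_survival(12, rates, cuts=[6])
```

In the toy data the later-period hazard is much lower than the early one, which is the pattern a single constant-hazard model would average away.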

Flexible parametric methods

Flexible parametric models have two main advantages: first, they produce smooth hazard functions that follow the observed data more closely than standard parametric models; and second, they use piecewise polynomials, which can adjust to a range of hazard function shapes, making them flexible with regard to proportional hazards and monotonicity.61 Covariates can also be introduced to handle time dependence of hazards and describe hazards for different patient groups. Flexible parametric methods can be valuable for modeling survival with IOs because of the ability to isolate periods of immunological response, long-term remission among responders, time-dependent hazards, and covariates.62 For example, flexible parametric models were successfully used to extrapolate OS in HTAs of nivolumab in NSCLC and avelumab in metastatic Merkel cell carcinoma.63–65
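The article does not specify which spline formulation was used in the cited assessments. As an assumption for illustration, the sketch below uses one common form, a restricted cubic spline on the log cumulative hazard scale as a function of log time. All names, knot positions, and coefficient values are hypothetical; in practice the coefficients are estimated by maximum likelihood in a statistical package.

```python
import math


def spline_basis(x, knot, k_min, k_max):
    """Restricted cubic spline basis term for one interior knot.

    The construction makes the function linear in x beyond the boundary
    knots k_min and k_max, which stabilizes extrapolation.
    """
    lam = (k_max - knot) / (k_max - k_min)

    def cube(u):
        return max(u, 0.0) ** 3

    return cube(x - knot) - lam * cube(x - k_min) - (1.0 - lam) * cube(x - k_max)


def spline_survival(t, gamma0, gamma1, spline_coefs, interior_knots, k_min, k_max):
    """S(t) = exp(-H(t)) with log H(t) = gamma0 + gamma1 * x + spline terms,
    where x = log(t) and knots are given on the log-time scale."""
    x = math.log(t)
    log_cum_hazard = gamma0 + gamma1 * x
    for coef, knot in zip(spline_coefs, interior_knots):
        log_cum_hazard += coef * spline_basis(x, knot, k_min, k_max)
    return math.exp(-math.exp(log_cum_hazard))


# With no interior knots the model reduces to a Weibull; with gamma1 = 1
# it is an exponential with rate exp(gamma0).
s_no_knots = spline_survival(5.0, math.log(0.1), 1.0, [], [], 0.0, 3.0)

# One interior knot lets the log cumulative hazard bend over time.
s_one_knot = spline_survival(5.0, math.log(0.1), 1.0, [-0.05], [1.0], 0.0, 3.0)
```

Adding interior knots is what gives the model room to follow a hazard that rises, falls, and then flattens, at the cost of more parameters to estimate from limited tail data.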

Mixture models

While flexible parametric models give flexibility to fitting survival data, they have a limitation in that their parameters describe only the relationships between the observed variables. Finite mixture models, latent class models, or cure-fraction models are implemented based on using observed variables to determine unobserved latent classes for analysis. For IOs, these are latent classes of patients who are more or less likely to experience immunological response and long-term remission.8 10 66 67 Finite mixture models are not widely used in oncology but they are recognized as a potentially suitable method.68 69 Mixture cure models have been used to extrapolate OS data in submissions to the UK National Institute for Health and Care Excellence (NICE) and the Canadian Agency for Drugs and Technologies in Health for atezolizumab for the treatment of locally advanced NSCLC after chemotherapy.70 71
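The structure of a mixture cure model can be sketched in a few lines. In the simplified Python example below, a fraction of patients is treated as long-term survivors and the remainder follow an exponential survival curve; the cure fraction, the rate, and the function names are illustrative assumptions. The sketch also ignores background mortality, whereas applied models usually give the "cured" group the mortality of the age-matched and sex-matched general population.

```python
import math


def mixture_cure_survival(t, cure_fraction, rate):
    """S(t) = pi + (1 - pi) * S_u(t), with exponential S_u for uncured patients."""
    return cure_fraction + (1.0 - cure_fraction) * math.exp(-rate * t)


def population_hazard(t, cure_fraction, rate):
    """Hazard of the whole cohort, -S'(t) / S(t); it falls toward zero as the
    surviving cohort becomes dominated by the cured group."""
    uncured = (1.0 - cure_fraction) * math.exp(-rate * t)
    return rate * uncured / (cure_fraction + uncured)


# Hypothetical values: 20% long-term survivors, monthly hazard 0.1 for the rest.
survival_by_month = [mixture_cure_survival(t, 0.2, 0.1) for t in (0, 12, 36, 120)]
hazard_by_month = [population_hazard(t, 0.2, 0.1) for t in (0, 12, 36, 120)]
```

The survival curve levels off at the cure fraction rather than falling to zero, which is how this model class represents the plateau seen in the tail of IO Kaplan-Meier curves.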

Data limitations at the time of HTA submission

Given the challenges in accurately estimating the long-term survival for immunotherapies, supplemental evidence for supporting the clinical benefit of these drugs has been put forth, including surrogate end points to complement survival analyses and real-world evidence (RWE).

Surrogate end points

Surrogate end points offer manufacturers an alternative intermediate measure to link the benefit seen in clinical trials with long-term patient survival and are increasingly being used in oncology. Outcomes based on surrogate end points are available sooner than OS and, in the case of IOs, represent a potential solution to the challenges of collecting sufficiently mature OS data. According to the FDA, a surrogate end point is a marker, such as a laboratory measurement, radiographic image, physical sign, or other measure, that is not itself a direct measurement of clinical benefit, but it is (1) known to predict clinical benefit and could be used to support traditional approval of a drug or biological product; or (2) reasonably likely to predict clinical benefit and could be used to support the accelerated approval of a drug or biological product.72 In oncology, progression-free survival (PFS), disease-free survival, event-free survival, and durable objective overall response rate (ORR) are among the surrogate end points that have been used for accelerated approval by the FDA. However, it remains the manufacturer's challenge to evaluate the statistical correlations between intermediate end points and OS in order to establish validated surrogate end points that are acceptable to country-specific regulators and payers.72–78

Despite some surrogate end points being accepted by regulators, there is conflicting evidence on their reliability in predicting meaningful survival benefit for IO therapy.77 79 80 Traditionally, surrogate end points for OS have been PFS and response-based end points such as ORR. For IOs, criteria for tumor response and progression were updated in a set of alternative immune-related response criteria (irRC) to capture the response with immunotherapy; however, the association between irRC and OS has not been validated.81 Landmark PFS rates at 1 year, 2 years, and 3 years have also been proposed to be consistently reported end points in clinical trials of patients with melanoma that could serve as surrogates.82 83

A recent systematic review, which focused on the alternative surrogate end points and their association with OS in IOs, found that there are currently insufficient data to support a validated surrogate end point for OS.20 The two composite end points considered most promising in the review were durable response rate (DRR) and intermediate response end point (IME).20 DRR, a combination of standard response criteria and a prospective duration dimension of 6 months, was highly associated with OS. For the IME, a more complex analysis is required; IME is based on non-target lesion progression, new lesion, and target lesion information determined by baseline tumor burden, tumor reduction depth, and tumor change dynamic within 1 year after randomization. The authors found the association between IME and OS to be relatively strong. Treatment-free interval and treatment-free survival are also described as potential surrogates for hematologic malignancies. The authors noted that there was considerable heterogeneity in the statistical methods used in individual studies assessing surrogate end points and there remains a need to standardize approaches to reach a consensus.

Use of RWE

Historically, another limiting factor when assessing new therapies supported by immature data has been that the underlying processes do not allow for follow-up assessment with or without RWE. Separate trends in adaptive licensing at a regulatory level, and conditional reimbursement pathways, reflect this.84 85 RWE in assessing long-term survival benefit of IOs is not yet commonly used in HTA86 87 but has two potential applications: first, to provide data for modeling the survival benefit that can help generalize clinical trial data to real-world clinical practice88 89; and second, to validate externally both the clinical trial data, particularly from later stages of follow-up, and any predictive modeling of long-term survival benefit.90 91

RWE includes analysis of data gathered from non-randomized sources, such as patient registries and observational studies, among others. The FDA does not demand RWE for approval as these data are usually delayed relative to clinical trial data; however, the agency has recently become more receptive.92 As real-world studies do not adhere to the same degree of controlled conditions and predefined patient-management strategies as clinical studies, RWE is still considered to be of lower quality and less reliable than randomized controlled trial (RCT) data.93 Issues can depend on whether data are retrospective or prospective, but include selection bias at a patient and treatment center level,94 missing data on confounders, and low interoperability in general.86 The recording of drug-related toxicities in clinical practice also differs from the criteria used in clinical trials. Despite this, RWE provides data that are not readily available in an RCT, such as long-term outcomes.93 Thus, some HTA agencies, such as the French Transparency Commission or NICE, will provide reimbursement conditional on further efficacy and safety data being collected. The collected postlaunch data provide clinicians with a greater understanding of the long-term safety of IO therapies, and they are of value to HTA agencies during HTAs of new IO therapies or in reassessments.


Our findings indicate that, despite over a decade of use, there remain significant challenges in how best to demonstrate the long-term benefit of IOs to HTA agencies, clinicians, and payers. We have presented a succinct summary of the key challenges facing manufacturers when submitting evidence at HTA in demonstrating the full value of IOs for a range of tumor types, and further questions for consideration, in table 1. Many of these are well known and have been discussed in published literature, but they currently remain largely unresolved.

In general, longer-term follow-up should be standard practice for IO clinical trials. Long-term response and OS data could alleviate many of the current challenges. Research on biomarkers or other predictors of response and durable survival should also continue, as a successful biomarker could help clinical trials management and still provide timely patient access—a key goal in all drug development. As IOs may potentially be used in earlier stages of disease in the near future, surrogate end points will become even more important and should be validated. Greater development and use of RWE spans all stakeholders and is a key part of overcoming the gap between RCT data with low external validity and properly measuring long-term survival benefit.

From the perspective of manufacturers, the recommendations primarily involve further analysis and communication. Analysis of survival data should continue to explore methods that are appropriate to the mechanism of action, as standard measures of survival benefit are not appropriate. Manufacturers should also continue to communicate the mechanism of action and its impact on long-term response, survival benefit, and how it changes outcomes data and methods of analysis.

We have attempted to synthesize in table 1 a structured way in which manufacturers should link the mechanism of action of IOs to subsequent expectations of the shape of the long-term survival curve such as the plateau effect. This links logically into the statistical and economic modeling of long-term survival, focusing on model structure and survival curve extrapolation methodology, but also accounting for the limited clinical trial data at HTA submission.

From the perspective of HTA agencies, there is an apparent need for more explicit consideration of the same issues around mechanism of action and survival analysis, including how it is incorporated into economic modeling. HTA agencies can help effect change in modeling by manufacturers by explicitly recommending modeling approaches that acknowledge and reflect the underlying biology and natural history of disease and treatment effect, rather than attempts at statistical curve fitting. HTA agencies can also increase communication with manufacturers prior to submission to align on appropriate methods of analysis.

Payers can also help these processes to evolve with formal reviews of past decisions. Retrospective reviews will help to improve the use of surrogates and the potential for long-term data to be used in the future to update decisions. As well as helping to better understand how past data, methods, and decisions appear today, this would support the use of conditional approvals and reimbursement based on surrogate end points with confirmatory real-world or long-term follow-up data.

Assessment processes and preferred methodology differ across countries, meaning that there is some limit to how much consistency in assessing IO survival benefit is achievable. However, a European Union-wide cooperation on HTA has been proposed recently by the European Commission, focusing on relative effectiveness assessment (REA) for pharmaceuticals and medical devices.95 In an editorial assessing this proposal, Kanavos et al highlighted that the HTA framework must be more explicit and realistic about clinical value definition, what constitutes quality of evidence, how RWE is handled, and how to ensure consistency in REA interpretation.96 They concluded that this initiative can deliver wider benefits, a key one being member states having more resources to assess performance of interventions in their healthcare systems.


We have presented here the key challenges faced when demonstrating long-term survival benefit from treating patients with IOs: challenges linked to the IO mechanism of action, the analysis and extrapolation of long-term survival data, and using data that may be immature at the time of HTA submission. We outlined ways in which manufacturers can meet or mitigate these challenges, with a focus on strengthening survival analysis methods to capture the underlying biology of disease and treatment, rather than just statistical curve fit, including the potential use of biomarkers, surrogate end points, and RWE in modeling.

It is crucial that patients have timely access to novel therapies and particularly breakthrough therapies such as IOs. Therefore, we outlined the steps that manufacturers are recommended to take to develop and submit evidence to HTA bodies in a structured and consistent way, and to drive wider education across all stakeholders—manufacturers, payers, and clinicians—in considering long-term survival benefits with IOs.


The authors wish to thank Diana Steinway who provided medical writing services on behalf of Bristol-Myers Squibb and PRMA Consulting.



  • Contributors The authors took full responsibility for the content of this publication and confirmed that it reflects their viewpoint and expertise and have approved the submitted version. The authors confirmed that GPP3 guidelines were followed throughout the development of the paper. The authors received no financial compensation for authoring the paper.

  • Funding This research was funded by Bristol-Myers Squibb.

  • Competing interests AKP is an employee of PRMA Consulting, who conducted this research for Bristol-Myers Squibb, the sponsor of the research. At the time of the study, CQ was also an employee of PRMA Consulting. SW and JB are employees of Bristol-Myers Squibb, the sponsor of the research. At the time of the study, EW was also an employee of Bristol-Myers Squibb. LPG, MBA, GdP, KH, PAA, and PM received consultancy fees from Bristol-Myers Squibb.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.