Background Response criteria developed when cytotoxic chemotherapy was the predominant therapeutic modality for treating patients with cancer do not capture the full spectrum of tumor response patterns observed with anti-PD-1/PD-L1 antibody treatment. iRECIST was developed to capture both typical and atypical response patterns.
Methods Target, non-target, and new lesion measurements for 7920 patients receiving anti-PD-1/PD-L1 antibody (n=4751) or anti-CTLA-4 antibody (n=613) or undergoing chemotherapy (n=2556) from 14 randomized controlled trials submitted to the U.S. Food and Drug Administration were used to calculate the best overall response, objective response rate and progression-free survival (PFS) per iRECIST (iPFS) and Response Evaluation Criteria in Solid Tumours (RECIST). Associations between either PFS or iPFS and overall survival (OS) were evaluated using the method adopted by Oba et al.1
Results Among 4751 anti-PD-1/PD-L1-antibody treated patients, 31.5% (95% CI 30.2% to 32.9%) and 30.5% (95% CI 29.2% to 31.8%) achieved an objective response per iRECIST or RECIST V.1.1, respectively. OS among the 48 patients with objective response by iRECIST only resembled that in patients with responses per RECIST V.1.1. The association between iPFS and OS was R2=0.277 and that between PFS and OS was R2=0.260.
Conclusions Patients treated with anti-PD-1/PD-L1 antibodies with initial progressive disease per RECIST V.1.1 can experience prolonged stability or substantial reductions in tumor burden per iRECIST, atypical response patterns associated with prolonged OS. In the subgroup of patients with atypical responses, the application of iRECIST retrospectively in the evaluation of the objective response durations and the magnitude of PFS results in large differences compared with RECIST V.1.1. For the overall pooled population, the magnitude of these differences was modest, although a large proportion of patients had no further tumor assessments following RECIST V.1.1-defined progressive disease. Prospective studies employing iRECIST will be required to assess whether these response criteria more fully capture the benefit of immune checkpoint inhibitors.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.
The U.S. Food and Drug Administration (FDA) has considered large durable treatment effects on tumor burden and large effects on progression-free survival (PFS) in randomized controlled trials (RCTs) to predict effects on overall survival (OS), a direct measure of clinical benefit.2 Because these endpoints can be evaluated earlier than OS to characterize a drug’s efficacy, and these endpoints are not confounded by crossover, clinical trials routinely assess treatment effects on PFS and objective response rate (ORR). Identifying the optimal algorithm to detect changes in tumor burden that correlate best with OS is particularly important in the regulatory setting to ensure that new, safe, and effective therapies are accessible to patients as soon as possible.
The criteria to characterize treatment effects on tumor have evolved to maintain or improve accuracy while limiting administrative costs and other burdens. The first generally accepted bidimensional criteria, the WHO response criteria (1981),3 were replaced by the unidimensional Response Evaluation Criteria in Solid Tumours (RECIST) (2000)4; the current standard is RECIST V.1.1,5 a widely used, standardized algorithm for characterizing tumor response and tumor progression in clinical trials. Investigators, the pharmaceutical industry, and regulatory agencies accept ORR, duration of response (DOR), and PFS as assessed by conventional response criteria (eg, RECIST V.1.1) as valid measures of clinically meaningful changes in tumor burden, describing treatment effects supporting some of the new drug approvals.6
However, RECIST V.1.1 does not capture the atypical patterns of tumor response described with ipilimumab and anti-PD-1/PD-L1 antibodies. These atypical responses include initial increase in tumor size followed by a clinically important reduction in tumor burden (‘tumor flare’/pseudoprogression) or initial reduction in tumor size with appearance of new lesions that subsequently regress. To account for atypical tumor responses, several modified response criteria have been proposed, including the immune-related response criteria,7 immune-related RECIST,8 immune-modified RECIST,9 and immune RECIST.10 These criteria differ in their consideration of new lesions (ie, whether new lesions indicate progression and/or incorporation of new lesions in the measurement of tumor burden), requirements for confirmation of progression, and use of bidimensional or unidimensional measurements of tumor lesions (online supplementary table 1).
Evidence to support whether these novel response criteria better assess outcomes in patients in trials evaluating immunotherapeutics (as compared with RECIST V.1.1) is limited. To provide insight into the relative performance of iRECIST, as immune-based response criteria are increasingly employed in cancer immunotherapy trials, we conducted a retrospective assessment of response (ORR, best overall response (BOR), and PFS) according to iRECIST and RECIST V.1.1.
All trials that evaluated the safety and efficacy of an anti-PD-1/PD-L1 antibody and were submitted to FDA between September 2014 and September 2017 (references 11–24) were assessed for inclusion of a randomized active control and the potential opportunity to treat patients beyond initial RECIST V.1.1-defined progression. We identified 14 multicenter RCTs, 11 open-label and 3 double-blind, in patients with melanoma, squamous/non-squamous non-small cell lung cancer (NSCLC), renal cell carcinoma (RCC) and head and neck squamous cell carcinoma (HNSCC) meeting these criteria.
Data extraction and analysis
Patient-level data from investigator-assessed tumor measurements were used to calculate the response status at each time point according to RECIST V.1.1 and iRECIST (online supplementary table 2). Patients for whom there was only a baseline tumor assessment with no post-baseline assessments, who received non-protocol anticancer therapy prior to the first post-baseline assessment, or for whom all assessments were identified as non-evaluable in the datasets were categorized as ‘not-evaluable’ (a subcategory of non-responder) in our analyses of ORR.
Definitions and outcomes
The analyses of ORR were conducted in the ‘as-treated’ population defined as patients who received at least one dose of protocol-specified therapy. ORR was defined as the proportion of patients achieving a complete response (iCR/CR) or partial response (iPR/PR) per iRECIST or RECIST V.1.1, respectively. iBOR/BOR was defined as the single best response status at any evaluation assessment timepoint prior to receipt of non-protocol therapy or prior to progression of disease (PD) by RECIST V.1.1 and prior to confirmed progression of disease (iCPD) by iRECIST. In accordance with iRECIST, patients with stable disease (iSD) or better after an initial unconfirmed progression of disease (iUPD) were evaluated after iUPD for iCR/iPR/iSD in the determination of iBOR.
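The difference between BOR and iBOR described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used for the analyses: the function name, the status labels as plain strings, and the ranking dictionaries are assumptions of the sketch, and it deliberately omits response confirmation, non-evaluable visits, and the cut-off at non-protocol therapy.

```python
# Illustrative rankings, best status first (assumed for this sketch).
RECIST_RANK = {"CR": 0, "PR": 1, "SD": 2, "PD": 3}
IRECIST_RANK = {"iCR": 0, "iPR": 1, "iSD": 2, "iUPD": 3, "iCPD": 4}


def best_response(statuses, rank, stop):
    """Return the best status seen up to and including the first `stop` status.

    statuses: time-ordered list of per-assessment statuses.
    rank:     dict mapping each status to its rank (lower is better).
    stop:     status that ends the evaluation window
              ("PD" for RECIST V.1.1, "iCPD" for iRECIST).
    """
    best = None
    for status in statuses:
        if best is None or rank[status] < rank[best]:
            best = status
        if status == stop:
            break  # nothing after this assessment counts
    return best


# Hypothetical patient: early growth, then a later partial response.
print(best_response(["PD", "PR"], RECIST_RANK, "PD"))         # window closes at PD
print(best_response(["iUPD", "iPR"], IRECIST_RANK, "iCPD"))   # iUPD does not close it
```

Because iUPD does not close the evaluation window, a later iPR can still become the iBOR, whereas the same patient is fixed at PD under RECIST V.1.1.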
Analyses of OS and PFS were conducted in the intent-to-treat (as randomized) population. OS was measured from the date of randomization until death; in accordance with FDA Guidance,2 data were censored at the data cut-off date specified in the clinical study report for each trial. Our analyses of progression-free survival (ie, iPFS per iRECIST and PFS per RECIST V.1.1) were calculated from the date of randomization to the date of disease progression by iRECIST/RECIST V.1.1 or death, whichever occurred earlier. In the analyses of PFS per RECIST V.1.1, patients not experiencing progression or death were censored as of their last tumor assessment prior to any subsequent non-protocol anticancer therapy. In the analyses of iPFS, patients with iUPD at their last assessment were assigned a date of progression at the earliest timepoint iUPD was consecutively determined for that patient (ie, sequential iUPD determinations without an intervening iSD, iPR, or iCR determination).
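The rule for patients whose last assessment is iUPD can be illustrated as follows. This is a sketch under stated assumptions: the function name and the (time, status) tuple layout are hypothetical, times are in arbitrary units, and only the trailing-iUPD rule from the paragraph above is covered (iCPD and death events are not handled here).

```python
def iupd_progression_time(assessments):
    """Return the progression time for a patient whose last assessment is iUPD.

    assessments: time-ordered list of (time, status) tuples.
    The event time is the earliest iUPD in the unbroken run of iUPD
    determinations that ends at the last assessment. Returns None when
    the last assessment is not iUPD (the rule does not apply).
    """
    if not assessments or assessments[-1][1] != "iUPD":
        return None
    event_time = None
    # Walk backwards until an intervening iSD/iPR/iCR (or anything else) breaks the run.
    for time, status in reversed(assessments):
        if status != "iUPD":
            break
        event_time = time
    return event_time


# Hypothetical patient: iUPD at week 8, iSD at week 16, then iUPD at weeks 24 and 32.
print(iupd_progression_time([(8, "iUPD"), (16, "iSD"), (24, "iUPD"), (32, "iUPD")]))
```

In this hypothetical example the intervening iSD resets the earlier iUPD, so the progression date falls at the start of the final run rather than at the first iUPD.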
Associations between survival and response status in subgroups of patients where responses per iRECIST and RECIST V.1.1 differed were analyzed and plotted separately.
Kaplan-Meier25 methods were used to estimate iPFS/PFS and OS. The association between iPFS/PFS and OS was evaluated using a weighted linear regression model with weights equal to the sample size of each comparison. The strength of this association was measured by the coefficient of determination (R2) from the weighted linear regression model, where values close to 1 represent strong association and those close to 0 represent lack of association. Estimated treatment effects of iPFS, PFS and OS were calculated as the log of the HR from an unstratified Cox proportional hazards model with study arm as the covariate. The analysis for this paper was generated using SAS software V.9.4. Survival figures were generated using R V.3.4.3 and RStudio V.1.1.456.
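For readers unfamiliar with the trial-level association measure, the weighted regression and its R2 can be sketched in plain Python. The analyses themselves were run in SAS; the function below is only an illustration, its name and signature are assumptions, and the input values in the usage example are invented and are not data from the 14 trials.

```python
def weighted_r_squared(x, y, w):
    """Weighted least-squares fit of y on x; returns (slope, intercept, r2).

    x: treatment effects on PFS or iPFS (e.g., log HR), one per comparison
    y: treatment effects on OS (log HR), one per comparison
    w: weights, here the sample size of each comparison
    """
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    syy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Weighted residual sum of squares around the fitted line.
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    r2 = 1.0 - ss_res / syy
    return slope, intercept, r2


# Invented log HRs and sample sizes, for illustration only.
log_hr_pfs = [-0.45, -0.20, -0.60, 0.05]
log_hr_os = [-0.30, -0.25, -0.35, -0.10]
n_patients = [400, 550, 300, 800]
slope, intercept, r2 = weighted_r_squared(log_hr_pfs, log_hr_os, n_patients)
print(round(r2, 3))
```

Weighting by sample size means that larger comparisons pull the fitted line and the R2 more strongly than smaller ones.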
Among the 8170 randomized patients, 4802 (59%) were randomized to receive an anti-PD-1/PD-L1 antibody; 642 (8%) were randomized to receive an anti-CTLA-4 antibody; and 2728 (33%) were randomized to receive chemotherapy. The as-treated population included 7920 patients receiving at least one dose of study-specified therapy, comprising 4751 (60%) anti-PD-1/PD-L1 antibody-treated patients, 613 (8%) anti-CTLA-4 antibody-treated patients, and 2556 (32%) chemotherapy-treated patients (figure 1).
ORR was 31.5% by iRECIST and 30.5% by RECIST V.1.1 for anti-PD-1/PD-L1 antibody-treated patients, for a difference in ORR of 1 percentage point. The differences in ORR by iRECIST and RECIST were even smaller for those treated with an anti-CTLA-4 antibody (19.7% vs 19.2%) or chemotherapy (15.2% vs 15%) (table 1). There were 232 (4.9%) of the 4751 anti-PD-1/PD-L1 antibody-treated patients who achieved an iCR or iPR (n=133 (2.8%)) or iSD (n=99 (2.1%)) after RECIST V.1.1-defined progression (table 2). Of these 133 patients with iPR/iCR, 85 (64%) also achieved a CR or PR prior to their RECIST V.1.1-defined progression, whereas 48 (36%) patients had stable (n=11) or progressive disease (n=37) as their best response by RECIST V.1.1 (ie, iRECIST-only responses) (table 3). The median DOR among all responding patients by iRECIST was 10.1 months (range: 0–33.4) and that by RECIST V.1.1 was 9.2 months (range: 0–33.3) (table 1). Among the 138 patients with a DOR that differed based on response criteria, the median iDOR was 14.1 months (range: 0–33.4) and the median DOR was 6.1 months (range: 0.9–24.9) (online supplementary figure 1).
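ORR is a simple proportion, and the CIs reported for it depend on the interval method, which is not stated here. As an illustration only, the sketch below computes a Wilson score interval; the choice of method, the function name, and the counts in the usage example are assumptions and are not taken from the trial data.

```python
import math


def wilson_interval(responders, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p = responders / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: 30 responders among 100 treated patients.
low, high = wilson_interval(30, 100)
print(f"ORR 30.0% (95% CI {low:.1%} to {high:.1%})")
```

Other methods (for example, exact binomial intervals) give slightly different limits, so an interval computed this way need not reproduce the published values.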
Among the 37 patients with iRECIST responses whose BOR by RECIST V.1.1 was PD, evidence of PD by RECIST V.1.1 was based on an increase in existing lesions in 57% of patients (increase in target lesions in 11 patients (30%), increase in non-target lesions in 9 patients (24%), and increase in non-target plus new lesions in 1 patient (3%)) and on the appearance of new lesions in the other 43%. The spider plots summarizing the change in tumor burden over time by type of progression are presented in figure 2A–C. As illustrated in the spider plots, all PD events occurred within 14 weeks of initiation of treatment with a median time to iUPD of 8 weeks (min, max: 1, 14).
In the analyses of PFS, 62% of those randomized to anti-PD-1/PD-L1 antibodies were identified as having PD per RECIST V.1.1 and 59% per iRECIST. Of the 2832 patients with PD per iRECIST, 61% had iUPD assigned at the time of PD per RECIST and had no further disease assessments; 18% had iUPD with subsequent disease assessments that did not confirm PD (ie, iCPD not assigned); and 21% were subsequently identified as having iCPD. In total, there were 232 patients with PD by RECIST V.1.1 who had a longer duration of iPFS based on subsequent determination of iSD/iPR/iCR at a later timepoint, resulting in either censoring at the last assessment if ongoing (n=116, figure 3A) or in a later PFS event (n=116, figure 3B) based on iCPD (n=12), iUPD at the last assessment (n=93), or death (n=11); specifically, the respective median iPFS and median PFS durations were 18.7 (min, max: 1.3, 36.1) and 5.4 (95% CI 4.0 to 6.2) months in the iRECIST-censored subgroup and were 10.1 (95% CI 8.4 to 10.9) and 2.8 (95% CI 2.6 to 4.1) months in the subgroup with PFS events documented later by iRECIST. In the overall PFS analyses, the estimated median iPFS was 4.2 months (95% CI 4.0 to 4.4) compared with an estimated median PFS of 3.9 months (95% CI 3.6 to 4.1).
In the analyses of OS by response status in all randomized patients and in anti-PD-1/PD-L1-treated patients (figure 4A,B), patients with responses only per iRECIST appeared to have similar survival initially compared with those achieving PR or CR under RECIST V.1.1. Among the anti-PD-1/PD-L1-treated group, patients who achieved iSD after RECIST V.1.1 progression appeared to have similar survival initially to those achieving SD by RECIST with divergence of the curves at later timepoints. The estimated median OS time was 12.1 months (95% CI 8.9 to NR) for those with iSD only and 16.2 months (95% CI 14.9 to 18.2) for those with SD per RECIST V.1.1.
The correlations of PFS per RECIST V.1.1 and per iRECIST with OS were analyzed in the evaluation of PFS (RECIST V.1.1) and iPFS (iRECIST) as a surrogate endpoint for OS. There was a minimal increase in R2 value for the association of iPFS and OS compared with that for PFS and OS (R2=0.277 vs R2=0.260) (online supplementary figure 2 a, b).
Trials intended to demonstrate the efficacy of anticancer drugs and biologics commonly evaluate tumor measurement-based endpoints, such as ORR with prolonged durations of response and PFS, as the primary outcome measure; depending on the clinical setting of the disease, the magnitude of the treatment effect, and the risk–benefit profile, these endpoints may support an accelerated or regular approval.2 There is intense interest in the oncology community in the use of immune-based response criteria rather than the conventional criteria, given the observed pattern of atypical responses and that conventional criteria are reported to underestimate the ORR for cancer immunotherapeutics by up to 15%.26
Notably, the incremental percentage of immune-based responders by iRECIST is lower than that in previous reports using other immune-based criteria27 as our analysis did not consider patients with iSD following RECIST V.1.1 PD as responders, since this is not clear evidence of a drug effect and may represent the natural history of the disease in that patient.26 These analyses, conducted across 14 clinical trials, provide a large database in which ORR per iRECIST was similar to that per RECIST V.1.1 (31.5% vs 30.5%), a finding that was consistent across tumor types, with 2.8% of anti-PD-1/PD-L1 antibody-treated patients achieving an iCR or iPR after progression based on RECIST V.1.1. This includes the approximately 1% of patients with early RECIST V.1.1 PD who appear to derive benefit based on subsequent durable tumor responses as illustrated in online supplementary figure 1 and figure 2A–C. Additionally, some patients with RECIST V.1.1 responses experience iUPD followed by iSD, thereby increasing the durability of the response. Taken together, durability of response assessed by iRECIST led to a longer median DOR, a difference that was modest in the overall analysis population (~1 month) but potentially meaningful within the evaluation of specific therapeutic categories, for example, the anti-CTLA-4 antibody subgroup (~3 months increase in median DOR with iRECIST). Thus, while RECIST V.1.1 appears to capture most of the treatment effect based on ORR for anti-PD-1/PD-L1 antibodies, the use of these response criteria may result in early termination of effective treatment in a limited number of patients who may also exhibit prolonged survival similar to that observed in patients with RECIST V.1.1 response.
Patients with prolonged iSD, while not considered in the assessment of ORR, could substantially impact the assessment of treatment effects on PFS. Some patients with initial disease progression followed by disease stability are considered early progressors by RECIST but as having prolonged SD by iRECIST, thus improving the correlation between PFS and survival. Among anti-PD-1/PD-L1 antibody-treated patients, the estimated median iPFS was 4.2 months (95% CI 4.0 to 4.4), and the estimated median PFS was 3.9 months (95% CI 3.6 to 4.1) by RECIST V.1.1. Ultimately, PFS is a time-to-event tumor measurement-based endpoint that requires a control arm, typically in a randomized trial, to interpret the meaningfulness of a treatment effect on this outcome. The associations between iPFS and OS and between PFS and OS were weak and similar. Thus, on a trial-level basis, both tumor response criteria yield similar results. However, conclusions regarding the minimal differences in R2 value when evaluating the utility of iPFS as a surrogate for OS in immunotherapy trials compared with PFS are limited by the fact that this analysis included multiple cancer subtypes (melanoma, NSCLC, RCC, and HNSCC), and PFS per RECIST V.1.1 has not been validated as a surrogate for survival in all of these cancer types. Additionally, not all studies had mature OS follow-up, potentially affecting these results.
Although this analysis includes a large number of studies and patient data, it is limited by its retrospective nature and the fact that none of the trials were conducted according to iRECIST. Collection of non-target lesion tumor data in case report forms does not provide adequate information for characterization of response status by iRECIST since subsequent increase in size, which is used to confirm disease progression, was not collected after first designation of ‘unequivocal progression’. In our analyses, we took a conservative approach, assigning confirmation if unequivocal progression was indicated at the next assessment. Additionally, measurements for new lesions were not always recorded, thus confirmation based on an increase in lesion size could not always be made. Finally, iRECIST requires that if a patient discontinues prior to confirmation (iCPD), the reason for discontinuation should be captured in case report forms, as discontinuation is medically appropriate in some settings. The number of patients without confirmation of disease progression in our analysis is considerable; 79% of patients with iUPD were not confirmed as of the data cut-off date for the study or prior to leaving the trial, including the 61% of patients with PD per RECIST V.1.1 and no further disease assessments. The effects of missing data regarding confirmation of iUPD on determination of iPFS have not been assessed in this retrospective analysis.
The use of immune-based tumor response criteria as the primary assessment of tumor measurement-based endpoints has been hampered in part by the multiple response criteria that have been proposed and variations therein, complexity of individual criteria, differential application in randomized trials where immune-based response criteria are applied only to the experimental arm (immunotherapy) and not the control arm (eg, a chemotherapeutic regimen), and increasing reliance on single-arm trials to identify clinically meaningful treatment effects on ORR and DOR. Employing novel response criteria is particularly challenging for single-arm trials as the assessment of treatment effects relies on historic controls with cross-study comparisons confounded by use of disparate response criteria. While adoption of iRECIST in immunotherapeutic trials may address some of the limitations of RECIST V.1.1 with respect to patient management—limitations that are often overcome on a case-by-case basis in current protocols—the question remains whether the additional burden in data collection (eg, measurements of new lesions, assessment of non-target growth from the previous assessment, and confirmation of progression) and data evaluation (eg, programming patient response post initial RECIST V.1.1 progression and a more complicated algorithm regarding what constitutes a confirmation of progression) is outweighed by any impact that prospective use of iRECIST will have on the interpretation of tumor measurement-based endpoints such as ORR and PFS. While the magnitudes of treatment effects on ORR and PFS, as well as associations of PFS with OS, as calculated by iRECIST or RECIST V.1.1 appear similar, the substantial proportion of missing data following RECIST V.1.1 progression in this retrospective analysis limits firm conclusions of the utility of iRECIST for evaluation of immunotherapeutics in trials intended to support marketing authorization.
In prospective trials proposing the use of immune-based tumor response criteria such as iRECIST for the determination of the primary endpoint, the trial design, trial conduct, and analysis plan would need to minimize and address potential bias introduced with the use of such criteria. Given its widespread acceptance, iRECIST has the potential to unify how data are collected and evaluated across both single-arm and randomized trials, providing a new standard for tumor response assessment across trials.
FM and MRT contributed equally.
Contributors FM, PK, MRT, RP, and RS conceived and designed the study. FM performed the data analysis and produced the figures and tables. All authors interpreted the data and contributed to writing the report.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement No data are available. Trial data for this analysis were drawn from submissions evaluating the safety and efficacy of an anti-PD-1/PD-L1 antibody submitted to FDA between Sept. 2014 and Sept. 2017.