1303 Prediction of best response for NSCLC patients receiving immunotherapy by machine learning models
  1. Yili Zhang1,
  2. Samir Gupta1,
  3. Anas Belouali1,
  4. Shaked Lev-Ari1,
  5. Neil Shah2,
  6. Kanchi Krishnamurthy1,
  7. Micheal Serzan1,
  8. Adil Alaoui1,
  9. Peter McGarvey1,
  10. Michael Atkins1 and
  11. Subha Madhavan1
  1. 1Georgetown University, Washington DC, DC, USA
  2. 2Memorial Sloan Kettering Cancer Center, New York, NY, USA


Background Immune checkpoint inhibitors (ICIs) have been used to treat many distinct cancers, including non-small cell lung cancer (NSCLC). Better tools are needed to predict which patients will benefit from ICI therapy. This study aims to use machine learning (ML) models to predict the tumor response of patients with NSCLC to immunotherapy.

Methods The Georgetown-Lombardi Comprehensive Cancer Center has developed a centralized Immuno-Oncology (IO) registry encompassing patients treated with ICIs within the MedStar Health hospital system from 2011 until April 2018. Data on demographics, immunotherapy information, and lab tests after diagnosis were collected from the registry for 220 NSCLC patients. Responses were recorded at 12 weeks after the start of immunotherapy as complete response (CR), partial response (PR), progressive disease (PD), or stable disease (SD). In this study, we predicted whether patients responded to immunotherapy (CR and PR) or not (PD and SD). Ten ML models were trained for this binary prediction task and evaluated with five-fold cross-validation. The area under the receiver operating characteristic curve (AUROC) was used to assess each model, and the features that most influenced the model were subsequently analyzed.
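The evaluation setup described in the Methods can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code: the abstract does not say which library was used, how the folds were formed, or how ties in predicted scores were handled. The function names (`binarize`, `stratified_folds`, `auroc`), the stratified fold assignment, and the tie rule are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Responder definition taken from the abstract: CR and PR count as response,
# PD and SD as non-response.
RESPONDER_LABELS = {"CR", "PR"}


def binarize(responses):
    """Map 12-week response labels to 1 (CR/PR) or 0 (PD/SD)."""
    return [1 if r in RESPONDER_LABELS else 0 for r in responses]


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class balance.

    Assumption: the abstract says only "five-fold cross-validation";
    stratifying by class is a common choice, not a documented one.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen responder receives a
    higher score than a randomly chosen non-responder (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC is undefined unless both classes are present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a full pipeline, each fold would serve once as the held-out set while a model is fit on the other four, and `auroc` would be computed on the held-out predictions. With 74 responders among 220 patients, a stratified split of this kind puts 14 or 15 responders in each fold.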

Results Among the 220 curated patients, 74 (33.64%) responded to immunotherapy. Nivolumab was the most common regimen, used in 107 (48.38%) patients, followed by carboplatin + pembrolizumab + pemetrexed in 20 (9.09%) patients. The ten ML models evaluated included logistic regression, support vector machine, naive Bayes (NB), random forest, and Bernoulli NB. The three best-performing models by AUROC were logistic regression (77.91%), naive Bayes (73.93%), and random forest (73.88%). The three most important features in the logistic regression model were albumin/globulin (A/G) ratio, line of therapy, and pre-treatment Eastern Cooperative Oncology Group performance status (ECOG PS). Features found to be unimportant included age, sex, BMI, ALT (SGPT), and AST (SGOT).

Conclusions This study used ML algorithms to predict tumor response to ICI therapy in patients with NSCLC. This approach, which applies computational models to electronic medical record (EMR) data, can help predict the outcome of ICI therapy in these patients. The best prediction model had an AUROC of 77.91%. The limitations of this study are the relatively small sample size and the lack of molecular information. These will be addressed by updating and expanding the IO registry through the participation of other institutions and by linking it to a growing collection of omics data, which together should lead to more robust ML models.
