Abstract
Background Immune checkpoint blockade (ICB) has revolutionized our approach to cancer treatment, yet the response rate to ICB remains low. As large-scale ICB data accumulate, efforts to use these data to build machine learning predictors of ICB response are increasing. However, these models share several concerns that have so far impeded their clinical translation, including their black-box nature, which limits interpretability, and the risk of overfitting during model training.
Methods Here we analyzed ~3000 samples across 18 solid tumor types from multiple cohorts, with more than 20 clinical, pathologic, and genomic features measured. We developed, trained, and evaluated 20 machine learning models to identify the most predictive model for ICB response. Using a repeated cross-validation procedure, we compared the models by their performance on test sets and, importantly, by the difference in performance between training and test sets. The machine learning models include decision trees, Gaussian processes, support vector machines, XGBoost, and deep neural networks, among others. Finally, we developed the LOgistic Regression-based Immunotherapy-response Score (LORIS) using a transparent, compact 6-feature logistic LASSO regression model. This approach was validated for developing both pan-cancer and non-small cell lung cancer (NSCLC)-specific models across multiple independent datasets (figure 1).
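The model comparison described in Methods can be illustrated with a minimal sketch. It is not the study's pipeline: the data are synthetic (generated with scikit-learn's `make_classification`), only two of the many model families are shown, and the names `models`, `compare_models`, and `summary`, as well as the regularization strength `C=0.1`, are hypothetical choices made for illustration. The sketch runs repeated stratified cross-validation and reports, for each model, the mean test AUC and the gap between mean training and test AUC, the two quantities the abstract uses to rank models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study cohorts): 600 samples, 20 features.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, random_state=0
)

# Two illustrative candidates; the hyperparameters are assumptions.
models = {
    "logistic_lasso": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Repeated cross-validation: 5 folds, repeated 3 times with different splits.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)


def compare_models(models, X, y, cv):
    """Return mean train AUC, mean test AUC, and their gap for each model."""
    summary = {}
    for name, model in models.items():
        scores = cross_validate(
            model, X, y, cv=cv, scoring="roc_auc", return_train_score=True
        )
        train_auc = scores["train_score"].mean()
        test_auc = scores["test_score"].mean()
        summary[name] = {
            "train_auc": train_auc,
            "test_auc": test_auc,
            "gap": train_auc - test_auc,  # a large gap suggests overfitting
        }
    return summary


summary = compare_models(models, X, y, cv)
for name, s in summary.items():
    print(f"{name}: test AUC={s['test_auc']:.3f}, train-test gap={s['gap']:.3f}")
```

Ranking by both test performance and the train-test gap favors models that generalize rather than models that merely fit the training folds well.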
Results The linear LASSO regression model outperforms all other models and biomarkers, with the highest performance on cross-validation sets and, notably, the smallest performance difference between training and cross-validation sets (figure 2). LORIS outperforms previous signatures in ICB response prediction and can identify patients who are more likely to respond to ICB treatment, including those with low tumor mutational burden (TMB) or low tumor PD-L1 expression (figure 3). LORIS consistently predicts both short-term and long-term survival across almost all cancer types (figure 3). Most importantly, ICB response probability increases near-monotonically (from 0% to 100%) with LORIS, so the score can be used for both patient inclusion and exclusion. In contrast, ICB response probability is ~20% in low-TMB patients and does not always increase with higher TMB (figure 4). Finally, this approach is also effective in developing cancer-type-specific models for predicting ICB response (figure 5).
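One way to check whether response probability rises with a score, as reported in Results, is to bin patients by score and compute the observed response rate in each bin. The sketch below assumes the score lies in [0, 1], which is an assumption made here rather than something stated in the abstract. The helper `response_rate_by_bin` is hypothetical, and the data are simulated so that response probability equals the score; they are not study data.

```python
import numpy as np


def response_rate_by_bin(scores, responded, n_bins=10):
    """Observed response rate within equal-width score bins on [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    responded = np.asarray(responded, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each score to a bin; clip so a score of exactly 1.0
    # falls in the last bin instead of past it.
    idx = np.clip(np.digitize(scores, edges) - 1, 0, n_bins - 1)
    rates = np.full(n_bins, np.nan)  # NaN marks empty bins
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rates[b] = responded[mask].mean()
    return edges, rates


# Simulated cohort (NOT study data): response probability equals the score.
rng = np.random.default_rng(0)
sim_scores = rng.uniform(size=2000)
sim_responded = rng.uniform(size=2000) < sim_scores

edges, rates = response_rate_by_bin(sim_scores, sim_responded, n_bins=10)
for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
    print(f"score {lo:.1f}-{hi:.1f}: response rate {rate:.2f}")
```

A score that supports both inclusion and exclusion decisions would show rates near 0 in the lowest bins and near 1 in the highest, with a steady rise in between.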
Conclusions Our study identifies important clinical features linked to ICB response and survival, allowing for more accurate and interpretable predictions from just a few readily measurable clinical features. We expect that this method will help improve clinical decision-making in precision medicine to maximize patient benefit.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.