
1291 Multi-modal deep learning integrating radiology and pathology images to predict cancer immunotherapy response: a retrospective multi-cohort study
  1. Yuming Jiang1,
  2. Zhe Li2 and
  3. Ruijiang Li2
  1. 1Stanford University School of Medicine, Mountain View, CA, USA
  2. 2Stanford University, Palo Alto, CA, USA
  • Journal for ImmunoTherapy of Cancer (JITC) preprint. The copyright holders for this preprint are the authors/funders, who have granted JITC permission to display the preprint. All rights reserved. No reuse allowed without permission.


Background There is a critical unmet need for predictive biomarkers of cancer immunotherapy. The tumor microenvironment (TME) plays an important role in determining immunotherapy response and outcomes. Here, we aimed to develop and validate a multi-modal deep learning model that integrates routine histopathology and radiology images to predict TME status and immunotherapy response in gastric cancer patients.

Methods In this retrospective multi-cohort study, we developed a multitask deep learning model for the simultaneous prediction of TME status and disease-free survival from CT images and whole-slide haematoxylin and eosin (H&E)-stained images. Four TME classes were defined according to five immune biomarkers assessed by immunohistochemistry. We trained a deep convolutional neural network on a training cohort of 320 patients and tested the model in internal and external validation cohorts. To fuse the CT and H&E images, we first concatenated the density maps of the different cell types, serving as global features of the whole-slide image, with the CT image, resulting in a multi-channel image. This multi-channel image was then fed into the classification network for training and testing. We compared model performance against other deep learning models (single-modal and single-task models). Further, we evaluated the model’s association with prognosis and its ability to predict immunotherapy response. Multivariable analysis was performed using logistic regression to assess how different parameters (gender, age, smoking status, and the image model) affected immunotherapy response.
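The channel-wise fusion step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the image size, the specific cell-type labels, and the random data are all assumptions; in the study, the density maps would be derived from the H&E whole-slide image and the CT image resampled to a common grid.

```python
import numpy as np

# Hypothetical common grid size; the cell-type labels are illustrative only.
H, W = 224, 224
cell_types = ["lymphocyte", "tumor", "stroma"]

# Placeholder data standing in for per-cell-type density maps (from the
# whole-slide H&E image) and a co-registered CT slice.
rng = np.random.default_rng(0)
density_maps = {t: rng.random((H, W), dtype=np.float32) for t in cell_types}
ct_slice = rng.random((H, W), dtype=np.float32)

# Channel-wise fusion: stack the density maps together with the CT image,
# yielding one multi-channel input for the downstream classification network.
channels = [density_maps[t] for t in cell_types] + [ct_slice]
fused = np.stack(channels, axis=0)  # shape: (len(cell_types) + 1, H, W)

print(fused.shape)
```

The fused array would then be passed to a convolutional network whose first layer accepts `len(cell_types) + 1` input channels.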

Results The deep-learning model achieved high accuracy for assessing TME status in both internal and external validation cohorts (AUC 0.81–0.85). The multi-modal multi-task model outperformed the single-modal and single-task models in the validation cohorts. The deep-learning model was significantly associated with disease-free survival and overall survival in all cohorts (p<0.0001), and it remained an independent prognostic factor after adjusting for clinicopathological variables including tumor size, stage, differentiation, and Lauren histology. Moreover, the deep-learning model achieved higher accuracy (AUC 0.78) for predicting immunotherapy response than the programmed death-ligand 1 combined positive score (CPS), and their combination further improved response prediction.

Conclusions The multi-modal multi-task deep-learning model could allow for accurate evaluation of TME status from CT and H&E images. Furthermore, the deep-learning model predicted immunotherapy response and outcomes in gastric cancer. Exploration of TME in multi-modal data may be a promising avenue for precision immunotherapy.

Ethics Approval Ethical approval was obtained from the institutional review boards of the participating centers, and patient consent was waived for this retrospective analysis.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made are indicated, and the use is non-commercial.
