1277 Identification of clinically relevant spatial tissue phenotypes in large-scale multiplex immunofluorescence data via unsupervised graph learning in non-small cell lung cancer
  1. Robert Egger1,
  2. Andrew Fisher2,
  3. Michael Drage1,
  4. Jimena Trillo-Tinoco2,
  5. John Abel1,
  6. Andrew Browne2,
  7. Deepta Rajan1,
  8. Tai Wang2,
  9. Jake Conway1,
  10. Catherine King2,
  11. Jacqueline Brosnan-Cashman1,
  12. Anne Lewin2,
  13. Arnaud Amzallag2,
  14. Thomas Lila2,
  15. Tyler Simpson2,
  16. Mike Montalto1,
  17. Benjamin Chen2,
  18. Benjamin Glass1 and
  19. Vipul Baxi2
  1. 1PathAI, Boston, MA, USA
  2. 2Bristol Myers Squibb, Cambridge, MA, USA


Background Multiplex immunofluorescence (mIF) allows simultaneous spatial interrogation of multiple cell- and tissue-based biomarkers from patient cohorts at scale using whole-slide images (WSI). Conventional approaches limit the spatial insights that can be identified because they reduce spatial data to human-derived feature sets (e.g., nearest neighbor), necessitating new methods for surveying spatial patterns in full. Here, we used a graph neural network (GNN) to identify clinically relevant, spatially defined tissue phenotypes with distinct immunogenic profiles in non-small cell lung cancer (NSCLC) and compared this method to traditional approaches.

Methods To identify relevant cancer, stromal, and immune cell types, mIF for cytokeratin, CD8, FoxP3, myeloperoxidase, CD68/CD163, and fibroblast activation protein-A was performed with a DAPI counterstain on clinical NSCLC specimens (N=165). From the same cohort, we extracted bulk mRNAseq and proteomic signatures. mIF images were acquired and spectrally unmixed using the Phenoptics platform and inForm software (Akoya). A convolutional neural network was trained to segment regions of cancer epithelium, stroma, and necrosis, while a pretrained network segmented all cells (HALO-AI, Indica Labs). Cells were converted into graphs, and a GNN autoencoder was trained to discover tissue patterns defined by spatial arrangement and mIF cell/tissue phenotypes. Hierarchical clustering identified patient subsets based on tissue patterns, and Cox proportional hazards models assessed overall survival (OS) (figure 1).
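
The abstract does not state how cells were turned into graphs, so the following is only a minimal sketch of one common choice: each segmented cell becomes a node, and edges link it to its nearest neighbors within a distance cutoff. The function name `build_cell_graph` and the parameters `k` and `max_dist` are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from scipy.spatial import cKDTree


def build_cell_graph(centroids, k=5, max_dist=50.0):
    """Return an undirected edge list of shape (E, 2).

    Assumption (not from the abstract): each cell is linked to at most
    k nearest neighbours that lie within max_dist of its centroid.
    """
    centroids = np.asarray(centroids, dtype=float)
    n = len(centroids)
    k_eff = min(k, n - 1)
    if k_eff < 1:
        return np.empty((0, 2), dtype=int)

    tree = cKDTree(centroids)
    # Ask for k_eff + 1 neighbours because the query point itself is returned.
    dist, idx = tree.query(centroids, k=k_eff + 1, distance_upper_bound=max_dist)

    edges = set()
    for i in range(n):
        for d, j in zip(dist[i], idx[i]):
            # Neighbours beyond max_dist come back with an infinite distance.
            if int(j) != i and np.isfinite(d):
                edges.add((min(i, int(j)), max(i, int(j))))
    return np.array(sorted(edges), dtype=int).reshape(-1, 2)


# Hypothetical centroids: three nearby cells and one isolated cell.
toy_centroids = [[0, 0], [1, 0], [0, 1], [100, 100]]
toy_edges = build_cell_graph(toy_centroids, k=2, max_dist=5.0)
```

In a full pipeline, per-cell attributes such as marker intensities, morphology, and tissue region would be attached to the nodes as feature vectors before training the GNN autoencoder.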

Results Segmentation and post-processing of 165 mIF WSIs resulted in graphs describing 243,646 cells/WSI (mean; range: 3,279-1,552,934). Training the GNN autoencoder on this dataset revealed three tissue phenotypes: (1) cancer epithelium, (2) cancer stroma and epithelial-stromal interface (ESI) with high inflammation, and (3) cancer stroma and ESI with low inflammation. Samples were then classified based on the relative abundance of these phenotypes. While conventional analyses (e.g., nearest neighbor) failed to identify subgroups with prognostic value, our model revealed prolonged OS (HR=0.46 [0.22, 0.96]) in patients with abundant phenotype 2. Furthermore, patients in this subgroup displayed significant overexpression of the MHC-I antigen presentation pathway and downregulation of a general immune exclusion signature, potentially increasing intrinsic anti-tumor immunity to improve prognosis.
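
As a rough consistency check on the reported hazard ratio, the sketch below recovers the standard error of log(HR), a Wald z statistic, and a two-sided p-value from the interval. It assumes the bracketed interval is a 95% Wald confidence interval that is symmetric on the log scale; the abstract does not state the confidence level, and the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE of log(HR), Wald z, and two-sided p from an HR and its CI.

    Assumption: the CI is a Wald interval, exp(log(HR) +/- z_crit * SE),
    with z_crit = 1.96 corresponding to an assumed 95% level.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


se, z, p = wald_from_ci(0.46, 0.22, 0.96)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the implied p-value falls just below 0.05, which agrees with the upper confidence bound of 0.96 sitting just below 1.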

Conclusions We developed a novel, unsupervised deep-learning-based GNN for analysis of WSI mIF data in NSCLC. This method identified interpretable tissue phenotypes with clinically relevant differences in anti-tumor immunity and prognosis. These data show the utility of unsupervised spatial deep-learning methods, compared to traditional approaches, for data-driven discovery of complex patterns in large-scale multiplex images.

Ethics Approval This study was performed in accordance with the Bristol Myers Squibb Bioethics policy and adhered to the World Medical Association Declaration of Helsinki for Human Research.

Abstract 1277 Figure 1

Study workflow. From left to right: Graphs are constructed from cell locations, morphology, mIF and tissue features. An unsupervised GNN learns to identify characteristic tissue phenotypes defined by these features and the spatial arrangement of cells. Finally, relative abundance of the different phenotypes is used to stratify patients into groups. Notably, these groups are associated with differences in overall survival.
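
The final stratification step in the workflow can be sketched as follows, assuming each cell has already been assigned one of the three tissue phenotypes. The linkage method (Ward), the number of patient groups, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify them. Phenotypes are indexed 0 to 2 in the code, corresponding to phenotypes 1 to 3 in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def phenotype_abundance(cell_labels, n_phenotypes=3):
    """Fraction of a slide's cells assigned to each tissue phenotype."""
    counts = np.bincount(np.asarray(cell_labels), minlength=n_phenotypes)
    return counts / counts.sum()


def cluster_patients(abundance_matrix, n_groups=2):
    """Hierarchically cluster patients on their phenotype abundances.

    Assumption: Ward linkage on Euclidean distance, cut into n_groups.
    """
    tree = linkage(np.asarray(abundance_matrix, dtype=float), method="ward")
    return fcluster(tree, t=n_groups, criterion="maxclust")


# Hypothetical per-cell phenotype labels for four slides.
slides = [
    [0, 0, 1, 1, 1, 1, 2],   # rich in phenotype index 1
    [0, 1, 1, 1, 1, 1, 2],   # rich in phenotype index 1
    [0, 0, 0, 2, 2, 2, 1],   # poor in phenotype index 1
    [0, 0, 2, 2, 2, 2, 1],   # poor in phenotype index 1
]
abundances = np.vstack([phenotype_abundance(s) for s in slides])
groups = cluster_patients(abundances, n_groups=2)
```

The resulting group labels could then serve as a covariate in a survival model, which is the role the patient subsets play in the study.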
