Abstract
Background Multiplex immunofluorescence (mIF) allows simultaneous spatial interrogation of multiple cell- and tissue-based biomarkers from patient cohorts at scale using whole-slide images (WSI). Identification of spatially derived insights is limited by conventional approaches that reduce spatial data into human-derived feature sets (e.g., nearest neighbor), necessitating new methods for surveying spatial patterns in full. Here, we utilized a graph neural network (GNN) to identify clinically relevant, spatially defined tissue phenotypes with distinct immunogenic profiles in non-small cell lung cancer (NSCLC) and compared this method to traditional approaches.
Methods To identify relevant cancer, stromal, and immune cell types, mIF for cytokeratin, CD8, FoxP3, myeloperoxidase, CD68/CD163, and fibroblast activation protein-A was performed with a DAPI counterstain on clinical NSCLC specimens (N=165). From the same cohort, we extracted bulk mRNAseq and proteomic signatures. mIF images were acquired and spectrally unmixed using the Phenoptics platform and inForm software (Akoya). A convolutional neural network was trained to segment regions of cancer epithelium, stroma, and necrosis, while a pretrained network segmented all cells (HALO-AI, Indica Labs). Segmented cells were converted into graphs, and a GNN autoencoder was trained to discover tissue patterns defined by spatial arrangement and mIF cell/tissue phenotypes. Hierarchical clustering identified patient subsets based on tissue patterns, and Cox proportional hazards models assessed overall survival (OS) (figure 1).
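The abstract does not state how cells were connected into graphs, so the following is only a minimal sketch of one common option, a k-nearest-neighbor graph over cell centroids. The function name `build_knn_graph`, the value of `k`, the coordinates, and the phenotype labels are all illustrative assumptions, not the authors' method.

```python
import math

def build_knn_graph(cells, k=3):
    """Link each cell to its k nearest neighbours (undirected edges).

    cells: list of (x, y, phenotype) tuples; this layout is hypothetical.
    Brute-force O(n^2) search, fine for a toy example but not for a WSI.
    """
    edges = set()
    for i, (xi, yi, _) in enumerate(cells):
        # Distance from cell i to every other cell, nearest first.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj, _) in enumerate(cells)
            if j != i
        )
        for _, j in dists[:k]:
            # Store each edge once, with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return edges

# Toy centroids with assumed marker-based labels.
cells = [
    (0.0, 0.0, "cytokeratin+"),
    (1.0, 0.0, "CD8+"),
    (0.0, 1.0, "CD8+"),
    (10.0, 0.5, "FoxP3+"),
]
edges = build_knn_graph(cells, k=1)
```

In a graph of this kind, node features would typically carry the cell phenotype and tissue region, and an autoencoder would learn embeddings from both the features and the edges. For slides with hundreds of thousands of cells, a spatial index would be needed in place of the brute-force search.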
Results Segmentation and post-processing of 165 mIF WSIs resulted in graphs describing a mean of 243,646 cells per WSI (range: 3,279 to 1,552,934). Training the GNN autoencoder on this dataset revealed three tissue phenotypes: (1) cancer epithelium, (2) cancer stroma and epithelial-stromal interface (ESI) with high inflammation, and (3) cancer stroma and ESI with low inflammation. Samples were then classified based on the relative abundance of these phenotypes. While conventional analyses (e.g., nearest neighbor) failed to identify subgroups with prognostic value, our model revealed prolonged OS (HR=0.46 [0.22, 0.96]) in patients with abundant phenotype 2. Furthermore, patients in this subgroup displayed significant overexpression of the MHC-I antigen presentation pathway and downregulation of a general immune exclusion signature, potentially increasing intrinsic anti-tumor immunity to improve prognosis.
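As an illustration of classifying samples by relative phenotype abundance, the sketch below computes the fraction of each of the three tissue phenotypes within one sample. The unit being counted (cells, nodes, or regions), the function name `phenotype_fractions`, and the toy labels are assumptions; the abstract does not specify them, nor any threshold for "abundant".

```python
from collections import Counter

def phenotype_fractions(labels, phenotypes=(1, 2, 3)):
    """Return the fraction of each tissue phenotype in one sample.

    labels: iterable of phenotype ids assigned within the sample
    (hypothetical input; the counting unit is an assumption).
    """
    counts = Counter(labels)
    total = sum(counts[p] for p in phenotypes)
    if total == 0:
        # Sample with no labelled units: report zeros rather than divide by zero.
        return {p: 0.0 for p in phenotypes}
    return {p: counts[p] / total for p in phenotypes}

# Toy sample: 2 units of phenotype 1, 4 of phenotype 2, 2 of phenotype 3.
sample_labels = [1, 1, 2, 2, 2, 2, 3, 3]
fractions = phenotype_fractions(sample_labels)
```

Per-sample fraction vectors of this form could serve as the input to hierarchical clustering of patients.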
Conclusions We developed a novel, unsupervised deep-learning-based GNN for analysis of WSI mIF data in NSCLC. This method identified interpretable tissue phenotypes with clinically relevant differences in anti-tumor immunity and prognosis. These data show the utility of unsupervised spatial deep-learning methods, compared to traditional approaches, for data-driven discovery of complex patterns in large-scale multiplex images.
Ethics Approval This study was performed in accordance with the Bristol Myers Squibb Bioethics policy (https://www.bms.com/about-us/responsibility/position-on-key-issues/bioethics-policy-statement.html) and adhered to the World Medical Association Declaration of Helsinki for Human Research.