Abstract
Background Spatial transcriptomics and proteomics measurements enable high-dimensional characterization of tissues at subcellular resolution, but understanding the larger-scale spatial organization of cells and extracting tissue structures of interest remain challenging tasks that require extensive human annotation. This challenge is particularly acute in cancers, where tissue structures are heterogeneous and may be defined by molecular subtypes as well as morphological features.
Methods To address the need for consistent identification of tissue structures, we present Spatial Cellular Graph Partitioning (SCGP), a novel annotation method that allows unsupervised identification of tissue structures reflecting the anatomical and functional units of human tissues. SCGP partitions tissue sections by performing community detection on graph representations of tissue regions. Input graphs consist of nodes that represent spatial units (e.g., cells) and edges that connect nodes that are spatially close or share similar features. We further present a reference-query extension pipeline based on SCGP that generalizes existing tissue structures to previously unseen samples by inserting pseudo-nodes representing reference partitions into the input graphs.
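As a rough illustration of the graph construction and partitioning idea described above, the following minimal sketch builds a graph whose edges link cells that are spatially close or have similar features, and then partitions it. It is not the SCGP implementation: the function names, the k-nearest-neighbor edge rule, the values of k, and the toy data are all assumptions made for illustration, and a simple deterministic label propagation stands in for whichever community detection algorithm SCGP actually uses.

```python
from collections import Counter
from math import dist


def knn_edges(vectors, k):
    """Undirected edges linking each item to its k nearest items (Euclidean)."""
    edges = set()
    for i, v in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: dist(v, vectors[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def build_graph(coords, features, k_spatial=2, k_feature=2):
    """Union of spatial-proximity edges and feature-similarity edges.

    Assumption: both edge types are defined by k-nearest neighbors.
    """
    edges = knn_edges(coords, k_spatial) | knn_edges(features, k_feature)
    neighbors = {i: set() for i in range(len(coords))}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    return neighbors


def label_propagation(neighbors, max_iter=50):
    """Stand-in community detection: each node repeatedly adopts the most
    common label among its neighbors (ties go to the smallest label)."""
    labels = {node: node for node in neighbors}
    for _ in range(max_iter):
        changed = False
        for node in sorted(neighbors):
            if not neighbors[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[n] for n in neighbors[node])
            top = max(counts.values())
            best = min(lab for lab, c in counts.items() if c == top)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Toy data (hypothetical): two well-separated groups of three cells each,
# with 2-D positions and 2-D feature vectors.
coords = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
features = [(1.0, 0.1), (0.9, 0.0), (1.0, 0.0),
            (0.0, 1.0), (0.1, 0.9), (0.0, 0.9)]

partition = label_propagation(build_graph(coords, features))
print(partition)
```

On this toy input, the first three cells end up sharing one label and the last three share another. Real tissue sections contain far more cells, so a practical version would need an indexed neighbor search and a more robust community detection method.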
Results Our experiments demonstrate reliable and robust partitioning on multiple datasets encompassing different tissue types, including kidney, brain, and skin sections. SCGP outperforms existing unsupervised annotation tools in recognizing compartments in kidney tissues with varying degrees of diabetic kidney disease, and it segments human brain sections with accuracy comparable to that of state-of-the-art spatial transcriptomics analysis tools. The extension pipeline is evaluated under a variety of conditions and is capable of addressing common challenges such as noise, artifacts, batch effects, and phenotypic differences. Downstream analysis of SCGP-identified tissue structures reveals disease-relevant biological insights, underscoring its potential to facilitate biomedical research and drive new discoveries.
Conclusions We demonstrate a rapid, unsupervised method to identify molecularly and spatially distinct tissue structures in spatial biology datasets. This method could aid in characterizing complex samples, such as tumor microenvironments, on multiple scales.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.