Background A concise, objectively constructed immune-specific gene set database is crucial in the era of immune checkpoint blockade (ICB) and adoptive cell therapy (ACT) for cancer. Immunologists need such gene sets to understand treatment mechanisms, the molecular distinctions between responders and non-responders, and the drivers of better survival. However, the current lack of such gene sets hampers immunological research, as existing immune pathway databases are limited and often applied uncritically. Objectively constructed, immunologically relevant pathways would provide immunologists with unbiased enrichment results and greater clinical interpretability.
Methods We collected 83 bulk RNA-seq datasets from the Molecular Signatures Database (MSigDB) C7 collection. These datasets contain samples challenged with infections of different kinds and magnitudes, whose transcriptomic profiles harbor yet-to-be-discovered immune functions. Using non-negative matrix factorization (NMF), we identified gene sets with coordinated expression, curated robust NMF programs, and merged them into meta-programs based on the Jaccard metric. We validated the clinical utility of these gene sets with The Cancer Genome Atlas (TCGA) pan-cancer data, a melanoma ICB cohort, and a 10x Genomics Visium FFPE Human Breast Cancer spatial slide.
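The merging step can be illustrated with a minimal sketch. It assumes each robust NMF program has already been reduced to a set of top-weighted genes, that programs are linked whenever their Jaccard similarity reaches a cutoff, and that a meta-program is the union of its members' genes. The function names, the 0.5 threshold, the union rule, and the toy gene lists are all hypothetical; the abstract does not state the actual parameters or merging rule.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_programs(programs, threshold=0.5):
    """Group programs into meta-programs by single linkage on Jaccard similarity.

    programs:  dict mapping program name -> iterable of gene symbols
    threshold: hypothetical similarity cutoff (not taken from the abstract)
    """
    names = list(programs)
    parent = {n: n for n in names}  # union-find forest over program names

    def find(n):
        # Follow parent links to the root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Link every pair of programs that is similar enough.
    for x, y in combinations(names, 2):
        if jaccard(programs[x], programs[y]) >= threshold:
            parent[find(x)] = find(y)

    # Collect the connected components.
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)

    # Assumption: a meta-program is the union of its members' genes.
    return [
        {
            "members": sorted(members),
            "genes": sorted(set().union(*(programs[m] for m in members))),
        }
        for members in groups.values()
    ]


# Toy input for illustration only; not data from the study.
toy_programs = {
    "p1": {"CD3E", "CD8A", "GZMB", "PRF1"},
    "p2": {"CD8A", "GZMB", "PRF1", "NKG7"},
    "p3": {"CD14", "LYZ", "S100A8"},
}
meta = merge_programs(toy_programs, threshold=0.5)
```

In this toy input, `p1` and `p2` share three of five distinct genes (Jaccard 0.6) and are merged, while `p3` remains its own meta-program. Single linkage is only one possible choice; hierarchical clustering on the Jaccard distance matrix would be an equally plausible reading of the method.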
Results We constructed 19 lymphoid and 9 myeloid novel gene sets (table 1) describing a diverse range of immune functions. We confirmed their functions with relevant single-cell RNA and T cell receptor sequencing data. These gene sets not only recovered the TCGA immune subtypes (figure 1A) but also defined a novel immune-microenvironment subtype (figure 2) with the lowest aneuploidy, TCR diversity, and neoantigen load but significantly better survival (figure 1C,D). These gene sets also provided better discriminatory power for ICB response (figure 1E) and suggested that ICB non-response is predetermined by high activity of these gene sets at baseline, pointing to possible T cell exhaustion that ICB cannot reverse (figure 1F,G). A risk score derived from these gene sets had better prognostic power in TCGA survival data (figure 1H). Lastly, these gene sets accurately delineated the tumor-immune boundaries in the H&E sections of the breast cancer spatial data (tables 1 and 2).
Conclusions These gene sets show promising translational utility across diverse cancer contexts. Because they were derived mainly from sepsis experiments, this suggests similarities between the immune microenvironments of cancer and sepsis and supports their wide applicability in cancer research. By studying gene set activities, immunologists can better understand the immune microenvironment and the drivers behind cancer survival, dissect the mechanisms of ICB treatment, and potentially overcome therapeutic resistance.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.