46 Comprehensive machine learning-driven platform infers key tumor characteristics from blood-derived cfRNA
  1. Andrey Shubin,
  2. Boris Shpak,
  3. Anastasiya Yudina,
  4. Elena Bushmanova,
  5. Maria Savchenko,
  6. Anastasiya Danchurova,
  7. Svetlana Bezlepkina,
  8. Ekaterina Ushakova,
  9. Polina Shulga,
  10. Kirill Shaposhnikov,
  11. Daniil Litvinov,
  12. Augustus W Shuster,
  13. Anastasiia G Tarabarova,
  14. Alexander Zaytcev and
  15. Michael F Goldberg
  1. BostonGene, Corp., Waltham, MA, USA
  • Journal for ImmunoTherapy of Cancer (JITC) preprint. The copyright holders for this preprint are the authors/funders, who have granted JITC permission to display the preprint. All rights reserved. No reuse allowed without permission.

Abstract

Background Recent progress in high-quality sequencing of circulating nucleic acids makes liquid biopsy an efficient approach for monitoring tumor evolution and therapy response. Cell-free RNA (cfRNA) from blood and other biofluids contains a tumor-derived fraction [1], offering a minimally invasive tool to characterize tumor-related transcriptomic states. Here, we present a comprehensive machine learning (ML)-driven platform for the analysis of blood-derived cfRNA to infer clinically important features and biomarkers of malignancies.

Methods cfRNA was extracted from 4 mL of double-spun plasma from 232 healthy donors and 168 cancer patients (92 breast, 36 lung, 23 pancreatic, and 17 colorectal cancer cases). NGS libraries were prepared according to the Agilent XT HS2 protocol using the V8+UTR exome-wide panel. Pisces 5.2 and samtools mpileup were used to call tumor-specific mutations from cfRNA. Abundance of transcripts from cancer-specific signatures was analyzed using gene set enrichment analysis (GSEA) and single-sample GSEA (ssGSEA). ML decision tree-based models were trained on artificial data generated from open-source bulk RNA-seq data from cancer cells, tissues, and sorted cells collected from the GEO database. Model testing was performed on real cfRNA sequencing data (n = 232 healthy, n = 168 cancer cases).
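The Methods do not specify how the artificial training data were generated. As an illustration only, the sketch below assumes one plausible scheme: each artificial cfRNA profile is a linear mixture of a healthy-plasma background profile and a tumor bulk RNA-seq profile at a randomly drawn tumor fraction. The function names `mix_profiles` and `simulate_samples`, the 0.2 upper bound on the tumor fraction, and all gene expression values are hypothetical and are not taken from the platform.

```python
import random


def mix_profiles(background, tumor, tumor_fraction):
    """Return a linear mixture of two expression profiles (gene -> TPM).

    Assumption: the tumor-derived signal adds linearly to the background.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    genes = sorted(set(background) | set(tumor))
    return {
        g: (1.0 - tumor_fraction) * background.get(g, 0.0)
        + tumor_fraction * tumor.get(g, 0.0)
        for g in genes
    }


def simulate_samples(background, tumor, n_samples, max_fraction=0.2, seed=0):
    """Draw artificial profiles with tumor fractions uniform in [0, max_fraction]."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    samples = []
    for _ in range(n_samples):
        fraction = rng.uniform(0.0, max_fraction)
        samples.append((fraction, mix_profiles(background, tumor, fraction)))
    return samples


# Hypothetical toy profiles (TPM); real inputs would be bulk RNA-seq profiles.
healthy = {"HBB": 900.0, "KRT19": 5.0, "PTPRC": 95.0}
tumor = {"HBB": 0.0, "KRT19": 800.0, "PTPRC": 200.0}

mixed = mix_profiles(healthy, tumor, 0.1)
artificial = simulate_samples(healthy, tumor, n_samples=5)
```

In a setup like this, the simulated profiles and their known labels would serve as training input for a tree-based classifier, while evaluation would use only real cfRNA samples, as described in the Methods.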

Results We developed robust protocols for plasma-derived cfRNA extraction and NGS library preparation for reproducible interpatient and intrapatient cfRNA transcriptome profiling (figure 1). cfRNA profiles from cancer patients contained mRNA transcripts carrying tumor-specific hotspot mutations, with a trend toward a moderate positive correlation between tumor and cfRNA variant allele frequencies (VAFs; R = 0.41, p = 0.064; figure 2). These profiles were also enriched with epithelial, epithelial-mesenchymal transition, senescence, and angiogenesis signatures (figure 3). We employed an ML-driven approach to infer tumor-specific characteristics from the tumor-derived cfRNA fraction for breast, colorectal, lung, and pancreatic cancers. When tested on clinical patient samples, ML models trained on artificial cfRNA transcriptomes detected breast cancer status (AUC = 0.73 ± 0.05, n = 153), tumor microenvironment fibrosis (AUC = 0.80 ± 0.07, n = 44), PD-1 biomarker status (AUC = 0.71 ± 0.03, n = 78), and liver metastasis (AUC = 0.70 ± 0.02, n = 143) (figure 4).
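For readers less familiar with the AUC values reported above, the following minimal sketch shows how a ROC-AUC can be computed from binary labels and model scores using its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical, and the abstract does not state how the authors computed their AUCs or the accompanying ± intervals.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: two healthy (0) and two cancer (1) samples.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported values of 0.70 to 0.80.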

Conclusions The presented cfRNA-based platform offers unprecedented insight into tumor biology compared with liquid biopsy assays used in current clinical practice. The proposed platform is universal and can potentially characterize any tumor-associated process accompanied by transcriptomic changes reflected in the cfRNA fraction.

Reference

  1. Larson MH, Pan W, Kim HJ, et al. A comprehensive characterization of the cell-free transcriptome reveals tissue- and subtype-specific biomarkers for cancer detection. Nat Commun 2021;12:2357.

Abstract 46 Figure 1

High reproducibility of cfRNA sequencing results on intra- and inter-patient levels

Abstract 46 Figure 2

Tumor-specific mutations in tumor-derived cfRNA transcripts. Across 18 samples, 21 tumor-specific, WES-matched hotspot mutations were detected in the cfRNA fraction, indicating the presence of tumor signal
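As a minimal sketch of the comparison between tumor and cfRNA variant allele frequencies, the code below computes a VAF from read counts and a correlation coefficient between two VAF lists. It assumes a Pearson correlation, which the abstract does not confirm, and the function names `vaf` and `pearson_r` as well as all read counts are hypothetical, not data from the study.

```python
import math


def vaf(alt_reads, ref_reads):
    """Variant allele frequency: alternate reads over total reads at a site."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (alt, ref) read counts for the same mutations in tumor and cfRNA.
tumor_counts = [(40, 60), (25, 75), (10, 90), (55, 45)]
cfrna_counts = [(3, 97), (4, 196), (1, 99), (6, 94)]

tumor_vafs = [vaf(a, r) for a, r in tumor_counts]
cfrna_vafs = [vaf(a, r) for a, r in cfrna_counts]
r = pearson_r(tumor_vafs, cfrna_vafs)
```

Because cfRNA VAFs come from far fewer tumor-derived reads than tissue VAFs, a small cohort can show a positive but statistically uncertain correlation.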

Abstract 46 Figure 3

Tumor-specific signatures in cfRNA. Results from ssGSEA reveal a pronounced epithelial signature in cfRNA from patients with various types of carcinomas compared to healthy individuals and sarcoma patients

Abstract 46 Figure 4

ROC-AUC of cfRNA-based models. ROC-AUC values for distinguishing healthy from diseased states, detecting TME fibrosis status in breast cancer, identifying PD-1 biomarkers in a pan-cancer cohort, and detecting liver metastasis; numbers indicate real patient cases

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.
