Article Text
Abstract
Background The Tumor Immuno-MicroEnvironment (TIME) is characterized by a heterogeneous interplay of cellular and molecular components. For the TIME of cancers caused by virus infection, comparing immune cells close to and far from sites of viral infection is crucial for understanding the localized immune response and developing targeted therapies, but this comparison was not possible until the advent of spatial transcriptomic techniques. Here, we propose a methodology to localize viral infection sites using SpaTial Enhanced REsolution Omics-sequencing (Stereo-seq, figure 1)1 data of virus-associated cancers. We used Epstein-Barr virus (EBV)-associated Nasopharyngeal Carcinoma (NPC)2 3 and human Hepatitis B Virus (HBV)-associated hepatocellular carcinoma (HCC)4 5 as two paradigmatic examples to show the robustness of this methodology.
Methods We ran the Stereo-seq experimental protocol separately for fresh-frozen NPC and LELC samples. In each run, we added one virus-free fresh-frozen cancer sample as a control. Given the low read depth per Nanoball, a grid of 100 x 100 Nanoballs (BIN100) was used as the unit of analysis to ensure sufficient read depth (figure 1A). For each BIN100, we used the Stereo-seq Analysis Workflow (SAW) pipeline v5.5.2 to align its reads to the human genome GRCh38.p13, followed by Seurat cell clustering and cell type annotation using the Bioconductor packages EasyCellType and SingleR. We used STAR v2.7.10b to map the reads that did not align to the human genome to the EBV-1 or HBV genome from NCBI RefSeq. Finally, we superimposed the BIN100s with high viral read counts onto the SAW-registered ssDNA fluorescent-staining image. In addition, we performed Hematoxylin and Eosin (H&E) staining and QuPath tissue annotation.
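The BIN100 aggregation described above can be sketched as follows. This is a minimal illustration and not the SAW implementation: the function names `to_bin` and `count_reads_per_bin`, and the input format (one `(x, y)` chip coordinate per read, in BIN1 units), are assumptions made for the example.

```python
from collections import Counter

BIN_SIZE = 100  # BIN100: 100 x 100 BIN1 spots per unit of analysis


def to_bin(x, y, bin_size=BIN_SIZE):
    """Map a BIN1 coordinate to the index of the square bin that contains it."""
    return (x // bin_size, y // bin_size)


def count_reads_per_bin(read_coords, bin_size=BIN_SIZE):
    """Count reads per bin from an iterable of (x, y) BIN1 coordinates."""
    return Counter(to_bin(x, y, bin_size) for x, y in read_coords)


# Hypothetical chip coordinates of reads that mapped to a viral genome
viral_reads = [(5, 7), (99, 99), (100, 0), (150, 20), (250, 310)]
viral_counts = count_reads_per_bin(viral_reads)
```

Integer division keeps the mapping independent of chip size, so the same helper would serve BIN50 or BIN200 by changing `bin_size`.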
Results The overlaid images of virus-positive BIN100s and ssDNA tissue staining illustrated a clear contrast between virus-positive and virus-negative cancer samples (figures 2A, 3A). As expected, most EBV-1-positive BIN100s were located in the invasive margin indicated by QuPath annotation of the H&E staining (figure 2A-B). Most cells surrounding viral infection sites belonged to B cell/plasma cell clusters 5,9,14 for NPC, and monocyte clusters 13,14,15 for HCC (figures 2,3). The clear separation of these clusters from other immune cell clusters suggests that the distance to the viral infection site markedly alters the gene expression profile of immune cells.
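The notion of distance to a viral infection site can be made concrete with a small sketch. It assumes that each BIN100 is identified by an integer grid index, that distance is measured between bin centers, and that one BIN100 spans 50 µm (100 spots at the 500 nm pitch given in figure 1A). The function name and the sample indices are hypothetical and do not come from the study.

```python
import math

BIN100_PITCH_UM = 50.0  # assumed: 100 spots x 500 nm center-to-center distance


def distance_to_nearest_site_um(bin_idx, virus_bins, bin_pitch_um=BIN100_PITCH_UM):
    """Euclidean distance (micrometers) from a bin to the closest virus-positive bin."""
    bx, by = bin_idx
    nearest_in_bins = min(math.hypot(bx - vx, by - vy) for vx, vy in virus_bins)
    return nearest_in_bins * bin_pitch_um


# Hypothetical virus-positive BIN100 indices
virus_bins = [(0, 0), (10, 10)]
d_near = distance_to_nearest_site_um((0, 0), virus_bins)  # bin is itself a site
d_far = distance_to_nearest_site_um((3, 4), virus_bins)   # 5 bins from (0, 0)
```

Distances computed this way could be used to group immune-cell bins into proximal and remote sets before a differential expression comparison.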
Conclusions Our study proposes a powerful virus localization method to uncover the fine structure of the TIME shaped by virus infection. In the future, we will test this method on more NPC and HCC samples and on more virus-free cancer samples to validate its robustness. We will also continue differential gene expression analysis of immune cell clusters at different distances from viral infection sites.
References
Chen A, Liao S, Cheng M, Ma K, Wu L, Lai Y, et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell. 2022;185:1777–1792.
Young L, Yap L, Murray P. Epstein-Barr virus: more than 50 years old and still providing surprises. Nat Rev Cancer. 2016;16:789–802.
Png YT, Yang AZY, Lee MY, Chua MJM, Lim CM. The Role of NK Cells in EBV Infection and EBV-Associated NPC. Viruses. 2021 Feb 15;13(2):300.
Chen CJ, Yang HI, Su J, Jen CL, You SL, Lu SN, Huang GT, Iloeje UH; REVEAL-HBV Study Group. Risk of hepatocellular carcinoma across a biological gradient of serum hepatitis B virus DNA level. JAMA. 2006 Jan 4;295(1):65–73.
Liu Z, Jiang Y, Yuan H, Fang Q, Cai N, Suo C, Jin L, Zhang T, Chen X. The trends in incidence of primary liver cancer caused by specific etiologies: Results from the Global Burden of Disease Study 2016 and implications for liver cancer prevention. J Hepatol. 2019 Apr;70(4):674–683.
Ethics Approval This study was approved by the Agency of Science, Technology and Research (A*STAR) Human Biomedical Research Office (A*STAR IRB: 2021-161, 2021-188, 2021-112).
Consent De-identified patient data was used in our work. Samples were collected with consent from patients.
Figure 1 Overview of the Stereo-seq technology and experimental workflow. (A) Graphical illustration of the Stereo-seq chip, a DNA nanoball (DNB)-patterned array with sub-cellular resolution. Circular templates with different coordinate identifiers (CIDs) are amplified separately by rolling circle amplification (RCA) using Phi29 DNA polymerase to form DNA nanoballs in separate spots. The diameter of each spot is 220 nm, and the spot center-to-center distance is 500 nm. In SAW data analysis, a square bin (BIN1) contains one spot, together with the sequencing reads from the spot and its surrounding area. Because read counts in a single BIN1 are low and therefore error-prone, users typically need to combine adjacent square bins into a larger square bin (BIN3 = 3 x 3 BIN1 spots in this illustration). Most studies use BIN50, BIN100 or BIN200. (B) The mechanism of the Stereo-seq experimental protocol. First, oligonucleotide probes carrying a CID, a UMI and poly-T are fixed onto the spot by hybridizing with the circular single-stranded DNAs in the fixed DNB. Next, a fresh-frozen tissue section is placed onto the Stereo-seq chip, and tissue poly(A) mRNAs hybridize with the fixed oligonucleotide probes. The subsequent steps include reverse transcription, amplification, cDNA library preparation and sequencing. The sequencing machine exports the raw results as fastq files. Finally, the Stereo-seq Analysis Workflow (SAW) uses a chip-specific mask file to convert CIDs to x and y coordinates on the chip, enabling spatial localization of sequencing reads. (C) Stereo-seq reads are paired-end. For each read pair, read 1 contains the CID and UMI information, and read 2 contains a 100 nt sequence of the cDNA of interest. (D) To date, the Stereo-seq experimental protocol does not allow hematoxylin and eosin (H&E) staining and in-situ sequencing on the same tissue section. Instead, H&E staining needs to be done on a nearby tissue section. The caveat of this two-section protocol is that for some fragile tissue blocks, the outlines of the tissue on the two sections may be discordant (see figure 3). The next version of the experimental protocol will resolve this issue and allow H&E staining and in-situ sequencing on the same tissue section.
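The bin sizes mentioned in panel A translate into physical dimensions by simple arithmetic. The sketch below assumes that the side length of a BIN n square equals n times the 500 nm spot center-to-center distance. The helper name `bin_geometry` is hypothetical.

```python
SPOT_PITCH_NM = 500  # spot center-to-center distance (panel A)


def bin_geometry(n, spot_pitch_nm=SPOT_PITCH_NM):
    """Return (number of spots, side length in micrometers) for a BIN n square."""
    return n * n, n * spot_pitch_nm / 1000


for n in (1, 3, 50, 100, 200):
    spots, side_um = bin_geometry(n)
    print(f"BIN{n}: {spots} spots, {side_um:g} um per side")
```

Under this assumption a BIN100 covers 10,000 spots and spans 50 µm per side, several cell diameters wide, which is why each BIN100 is annotated as a cluster of cells and not as a single cell.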
Figure 2 Applying the Stereo-seq experimental protocol and the Stereo-seq Analysis Workflow (SAW) to an Epstein-Barr Virus (EBV)-positive Nasopharyngeal Carcinoma (NPC) sample. (A) Localization of EBV-1 reads on the Stereo-seq chip indicated a clear contrast between EBV-positive and EBV-negative cancers. We performed Stereo-seq analysis for the EBV-positive NPC sample and an EBV-negative Oral Squamous Cell Carcinoma (OSCC) sample. Red circles mark the 10 BIN100s with the highest numbers of reads (≥100) mapped to the EBV-1 genome. The background image is the ssDNA fluorescence image from the tissue section used for in-situ sequencing, not the tissue section that underwent H&E staining (figure 1D). (B) QuPath annotation (bottom) of the H&E image (top) of the NPC sample. The tumor region is colored red in the QuPath image and dark purple in the H&E staining, while the stroma is colored green in the QuPath image and light pink in the H&E staining. (C) We found 18 clusters of BIN100s using the function ‘Seurat::FindClusters(resolution=2.2)’. All clusters were contiguous in coordinate space. (D) Seurat clusters of the NPC sample show clear localization relative to the QuPath annotation. Every cluster is located only or mostly in tumor, stroma or invasive margin: clusters 2,3,4,15 primarily represented stroma (colored in green); clusters 5,9,14, which were primarily plasma cells, primarily represented the invasive margin (the boundary between stroma and tumor area, colored in orange); the remaining clusters were colored in red. (E) t-SNE plot showing that the 18 clusters generated by Seurat were contiguous in t-SNE space. Contiguity in both coordinate space (panel C) and t-SNE space (this panel) indicated high quality of the clustering result. (F) t-SNE plot of the clusters located in tumor, stroma, and invasive margin (tumor - red, invasive margin - orange, stroma - green).
Figure 3 Applying the Stereo-seq experimental protocol and the Stereo-seq Analysis Workflow (SAW) to a Human Hepatitis B Virus (HBV)-positive HepatoCellular Carcinoma (HCC) sample. (A) Localization of HBV reads on the Stereo-seq chip indicated a clear contrast between HBV-positive and HBV-negative cancers. We performed Stereo-seq analysis for the HBV-positive HCC sample and an HBV-negative ColoRectal Carcinoma (CRC) sample. Red circles mark the 9 BIN100s with the highest numbers of reads (≥360) mapped to the HBV genome. The background image is the ssDNA fluorescence image from the tissue section used for in-situ sequencing. Some BIN100s at the top-right of the panel fall outside the tissue contour of the ssDNA image; we accept this because tissue morphologies differ between the two sections. Panels A and C together showed that clusters 13 and 15 (monocyte) and cluster 14 (hepatocyte) are most proximal to HBV infection sites. The concentration of clusters 13 and 15 around HBV infection sites illustrated that Seurat clustering successfully distinguishes monocytes proximal to HBV sites from those remote from them. (B) QuPath annotation (left) of the H&E image (right) of the HCC sample. The tumor region is colored red in the QuPath image and dark purple in the H&E staining, while the stroma is colored green in the QuPath image and light pink in the H&E staining. The outline of the HCC tissue on the H&E image differs from the outline of the HCC tissue on the Stereo-seq chip because of the limitation of the current experimental protocol (figure 1D). (C) We found 19 clusters of BIN100s using the function ‘Seurat::FindClusters(resolution=2.2)’. Cell type annotation was done with the Bioconductor packages EasyCellType and SingleR. All clusters were contiguous in coordinate space. (D) The localization of BIN100 clusters with respect to the QuPath annotation is less apparent than in the NPC sample. Here we assigned clusters manually: clusters 1,5,9–12,16–18 primarily represented stroma (colored in green); the remaining 10 clusters were colored in red. (E) t-SNE plot showing that the 19 clusters generated by Seurat were contiguous in t-SNE space. Contiguity in both coordinate space (panel C) and t-SNE space (this panel) indicated high quality of the clustering result. The cell type annotation is labeled on the figure. (F) t-SNE plot of the clusters located in tumor and stroma (tumor - red, stroma - green).
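The selection of highlighted BIN100s in panel A of figures 2 and 3 can be sketched as a threshold followed by a ranking. This is one plausible reading of the captions, not a description of the authors' code: the function name `top_viral_bins`, the order of the two steps, and the sample counts are assumptions.

```python
def top_viral_bins(viral_counts, n_top, min_reads):
    """Return up to n_top (bin, count) pairs with count >= min_reads, highest count first."""
    eligible = [(b, c) for b, c in viral_counts.items() if c >= min_reads]
    # Sort by descending count; ties are broken by bin index for reproducibility
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return eligible[:n_top]


# Hypothetical viral read counts per BIN100 index
counts = {(0, 0): 420, (0, 1): 95, (1, 1): 360, (2, 0): 130, (3, 3): 12}

ebv_style = top_viral_bins(counts, n_top=10, min_reads=100)  # thresholds as in figure 2A
hbv_style = top_viral_bins(counts, n_top=9, min_reads=360)   # thresholds as in figure 3A
```

Applying the same rule to the virus-free control sample gives a direct check on background: a control with no bins above the threshold supports the contrast shown in the overlays.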
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.