Original research
Single-cell sequencing on CD8+ TILs revealed the nature of exhausted T cells recognizing neoantigen and cancer/testis antigen in non-small cell lung cancer
  1. Hiroyasu Komuro1,2,
  2. Shuichi Shinohara1,3,
  3. Yasunori Fukushima1,2,
  4. Ayako Demachi-Okamura1,
  5. Daisuke Muraoka1,
  6. Katsuhiro Masago4,
  7. Takuya Matsui1,
  8. Yusuke Sugita1,
  9. Yusuke Takahashi1,
  10. Reina Nishida1,
  11. Chieko Takashima1,
  12. Takashi Ohki5,
  13. Yoshiki Shigematsu5,
  14. Fumiaki Watanabe6,
  15. Katsutoshi Adachi6,
  16. Takashi Fukuyama7,
  17. Hiroshi Hamana8,
  18. Hiroyuki Kishi8,
  19. Daiki Miura9,
  20. Yuki Tanaka9,
  21. Kousuke Onoue9,
  22. Kazuhide Onoguchi9,
  23. Yoshiko Yamashita9,
  24. Richard Stratford10,
  25. Trevor Clancy10,
  26. Rui Yamaguchi11,12,
  27. Hiroaki Kuroda3,
  28. Kiyoshi Doi2,
  29. Hisashi Iwata2 and
  30. Hirokazu Matsushita1,13
  1. 1Division of Translational Oncoimmunology, Aichi Cancer Center Research Institute, Nagoya, Japan
  2. 2Department of General Thoracic Surgery, Gifu University School of Medicine Graduate School of Medicine, Gifu, Japan
  3. 3Department of Thoracic Surgery, Aichi Cancer Center Hospital, Nagoya, Japan
  4. 4Pathology and Molecular Diagnostics, Aichi Cancer Center Hospital, Nagoya, Japan
  5. 5Department of Respiratory Surgery, Ichinomiya Nishi Hospital, Ichinomiya, Japan
  6. 6Department of Thoracic Surgery, Mie Chuo Medical Center, Tsu, Japan
  7. 7Division of Biomedical Research, Kitasato University Medical Center, Kitamoto, Japan
  8. 8Department of Immunology, University of Toyama, Toyama, Japan
  9. 9Drug Development Division, NEC Corporation, Minato-ku, Japan
  10. 10NEC OncoImmunity AS, Oslo Cancer Cluster, Oslo, Norway
  11. 11Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Japan
  12. 12Division of Cancer Informatics, Nagoya University Graduate School of Medicine, Nagoya, Japan
  13. 13Division of Cancer Immunogenomics, Nagoya University Graduate School of Medicine, Nagoya, Japan
  1. Correspondence to Dr Hirokazu Matsushita; h.matsushita{at}; Dr Hisashi Iwata; ihisashi{at}


Background CD8+ tumor-infiltrating lymphocytes (TILs) are often observed in non-small cell lung cancers (NSCLC). However, the characteristics of CD8+ TILs, especially T-cell populations specific for tumor antigens, remain poorly understood.

Methods High-throughput single-cell RNA sequencing and single-cell T-cell receptor (TCR) sequencing were performed on CD8+ TILs from three surgically resected lung cancer specimens. Dimensional reduction for clustering was performed using Uniform Manifold Approximation and Projection. CD8+ TIL TCRs specific for the cancer/testis antigen KK-LC-1 and for predicted neoantigens were investigated. Differentially expressed gene analysis, Gene Set Enrichment Analysis (GSEA) and single sample GSEA were performed to characterize antigen-specific T cells.

Results A total of 6998 CD8+ T cells were analyzed and divided into 10 clusters according to their gene expression profiles. An exhausted T-cell (Tex) cluster, characterized by the expression of ENTPD1 (CD39), TOX, PDCD1 (PD1), HAVCR2 (TIM3) and other genes, and by T-cell oligoclonality, was identified. The Tex TCR repertoire (Tex-TCRs) contained nine different TCR clonotypes recognizing five tumor antigens including a KK-LC-1 antigen and four neoantigens. Re-clustering the tumor antigen-specific T cells (n=140) showed that individual T-cell clonotypes were present on cells at different stages of differentiation and in different functional states, even within the same Tex cluster. Stimulating these T cells with the predicted cognate peptides indicated that TCR signal strength and subsequent T-cell proliferation and cytokine production were variable but always higher for neoantigens than for KK-LC-1.

Conclusions Our approach, which focuses on T cells with an exhausted phenotype among CD8+ TILs, may facilitate the identification of tumor antigens and clarify the nature of antigen-specific T cells, helping to define promising immunotherapeutic targets in patients with NSCLC.

  • Non-Small Cell Lung Cancer
  • Lymphocytes, Tumor-Infiltrating
  • CD8-Positive T-Lymphocytes
  • Antigens, Neoplasm

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See



  • Infiltration of CD8+ T cells into tumor tissues is often observed in lung cancer. However, the nature and tumor specificity of these CD8+ T-cell populations are poorly understood.


  • We defined cells with an exhausted phenotype as those enriched for T-cell receptors (TCRs) recognizing candidate novel tumor antigens. T cells at various differentiation states and with diverse functions in the same exhausted T-cell cluster resulted from weaker TCR signaling on recognition of a cancer/testis antigen relative to neoantigens.


  • Our approach facilitates the identification of a broader range of tumor antigens and clarifies the nature of the tumor antigen-specific T cells to define promising new therapeutic targets for antigen-based immunotherapy.


Lung cancer is one of the most common cancers and one of the leading causes of cancer death worldwide.1 Surgery, chemotherapy, radiation therapy, targeted therapy, immunotherapy or a combination of these are available for patients with lung cancer.2 Of these, the recent development of immune checkpoint inhibitors (ICI) has revolutionized the treatment paradigm.3 Although lung cancer is recognized as one of the most ICI-sensitive cancers, this approach is still only effective in a fraction of these patients.4–6 Therefore, the development of novel therapeutic strategies and/or biomarkers for more efficacious immunotherapy of lung cancers is urgently needed.

Cytotoxic CD8+ T cells (CTLs) are critically important immune cells that can eliminate tumor cells.7 CTLs must be primed and activated, and then arrive at the tumor site to mediate antitumor responses.8 ICI can restore suppressed immune responses to tumors by inhibiting ligand-receptor interactions involving lymphocyte regulatory molecules.9 Mutation-derived neoantigens may be critical tumor-specific targets for ICI-reactivated CTLs.10 Conversely, non-mutated tumor-associated antigens with more limited tumor specificity such as cancer/testis antigens (CTA) have also been reported to be targeted by immune responses in cancer.11 Thus, the identification of tumor-specific T cells and their cognate antigens may guide the design of antigen-based immunotherapeutic strategies such as cancer vaccination or T-cell receptor (TCR)-transduced adoptive cellular immunotherapy.12 13 In addition, their characterization would also accelerate the development of biomarkers to monitor antitumor immune responses in clinical immunotherapy programs.

Several previous studies have attempted to identify tumor-specific antigens recognized by tumor infiltrating lymphocytes (TILs). Tumor-specific T cells have been intensively investigated using several activation and exhaustion markers such as programmed cell death 1 (PD-1), CD137 and CD39 expressed on the TILs,14–16 and tumor-specific antigens have been identified in vitro using T cells expanded after isolation based on the expression of these markers. Recently, single-cell sequencing has revealed heterogeneous populations of TILs in several different cancers, and it has been shown that certain T-cell phenotypes may be associated with the expression of specific TCR sequences and the presence of defined tumor antigens.17–19 Overall, these studies demonstrated that tumor-derived antigen-specific T-cell populations are clearly distinguishable from bystander T cells that may recognize viral antigens in melanoma and other cancers.

In the present study, we performed high-throughput single-cell RNA sequencing (scRNA-seq) and single-cell TCR sequencing (scTCR-seq) on CD8+ TILs to elucidate the intratumoral T cell states in surgically-resected non-small cell lung cancers (NSCLC). We investigated tumor-specific T-cell populations by phenotyping and TCR analyses, and their responses to mutation-derived neoantigens and one representative CTA. We further explored the differentiation and functional states of tumor antigen-specific T cells.


Patients and data sets

Clinical tumor samples and matched blood samples from three patients with lung cancer who underwent surgery were collected at Aichi Cancer Center, Ichinomiya Nishi Hospital and Mie-Chuo Medical Center between April 2019 and July 2020. All procedures were in accordance with the ethical standards of the institutions and with the 1964 Helsinki declaration and its later amendments, or comparable ethical standards. Written informed consent was obtained from all individual participants included in the study.

Whole-exome and RNA sequencing data

The cohort used in the current study consisted of the three cases from our previous study in which human leukocyte antigen (HLA) typing, whole-exome sequencing and RNA sequencing data were all available.20

Neoantigen prediction

The immunogenicity of tumor-specific mutations was estimated using the NEC Immune Profiler (NIP) software from NEC OncoImmunity, comprising several proprietary machine-learning (ML) prediction algorithms. The algorithm considers the following features when predicting the immunogenicity of a candidate:

  1. The binding affinity of the peptide to major histocompatibility complex/HLA. NIP exploits several binding affinity ML predictors that compute IC50 (nM) scores for each mutated peptide.

  2. The peptide’s ability to be efficiently handled by the antigen processing machinery (APM). An ensemble of 13 Support Vector Machines included in NIP and trained on validated mass spectrometry immunopeptidome data sets determines which peptides have the optimal features to be efficiently processed by the APM, including the probability of cleavage by the proteasome and the efficiency of transport by the transporter associated with antigen processing (TAP).

  3. The expression of the candidate neoantigen. The expression of each candidate was computed by summing the values (transcripts per million, TPM) of all the isoforms coding for the specific peptide under consideration. To determine the specific abundance of the mutated peptide, this sum was adjusted according to the variant allele frequency computed at the RNA level.

  4. The ability of the somatic mutation’s originator protein to generate peptides with adequate properties to be antigen processed.

HLA class I neoantigen predictions were made for each of the three patients for peptides (Biologica) of 9 and 10 residues.
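
The expression feature (item 3) and the 9- and 10-residue peptide windows can be illustrated with a minimal Python sketch. This is not the NIP implementation, which is proprietary; the function names, the input layout and the way the variant allele frequency is applied (a simple multiplication) are assumptions made for illustration only.

```python
def mutant_peptide_abundance(isoform_tpms, rna_vaf):
    """Sum the TPM of all isoforms encoding the peptide, then scale the sum
    by the variant allele frequency measured at the RNA level (assumed to be
    a simple multiplication)."""
    if not 0.0 <= rna_vaf <= 1.0:
        raise ValueError("rna_vaf must lie between 0 and 1")
    return sum(isoform_tpms) * rna_vaf


def candidate_peptides(protein, mut_index, mut_residue, lengths=(9, 10)):
    """Enumerate every window of the given lengths that spans the mutated residue.

    protein     : wild-type amino acid sequence (string)
    mut_index   : 0-based position of the substituted residue
    mut_residue : amino acid present in the tumor at that position
    """
    mutated = protein[:mut_index] + mut_residue + protein[mut_index + 1:]
    peptides = []
    for k in lengths:
        first = max(0, mut_index - k + 1)          # leftmost window containing the mutation
        last = min(mut_index, len(mutated) - k)    # rightmost window that still fits
        for start in range(first, last + 1):
            peptides.append(mutated[start:start + k])
    return peptides
```

For a missense mutation far from either end of the protein, this yields nine 9-mers and ten 10-mers, each of which would then be scored for HLA binding and processing.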

Flow cytometry

Fresh tumor digests (FTD) were prepared using the gentleMACS tumor dissociator (Miltenyi Biotec, Auburn, California, USA), according to the manufacturer’s instructions. Cell counting was performed manually using a hemocytometer. FTD (2–5×10⁶ cells/tube) were cryopreserved with CP-1 (Kyokutoseiyaku, Tokyo, Japan) and kept in a deep freezer (−135°C) until use. Cryopreserved FTD (2–5×10⁶ cells) were thawed in RPMI and then stained after blocking Fc receptors using Human BD Fc Block (BD Biosciences, San Jose, California, USA). The following monoclonal antibodies (mAbs) were used: BV421-labeled CD3, APC-labeled CD4, FITC-labeled CD8, BV711-labeled CD103, PE-labeled CD39, and BV786-labeled PD-1 (all from BioLegend, San Diego, California, USA). SYTOX AADvanced Dead Cell Stain Kit (Thermo Fisher, Waltham, Massachusetts, USA) or Zombie NIR Fixable Viability Kit (BioLegend) was used to exclude dead cells. Stained cells were analyzed using an LSRFortessa X-20 flow cytometer (BD Biosciences) and data were processed using FlowJo V.10.0.7 (BD Biosciences).


Immunohistochemistry

Immunohistochemistry was performed on 4 μm-thick formalin-fixed paraffin-embedded sections. The following antibodies were used: CD3 (Polyclonal Rabbit, ready-to-use (RTU); Dako), CD8 (C8/144B, 1:100; Dako), PD-1 (NAT105, 1:50; Abcam), and programmed cell death ligand 1 (PD-L1) (22C3, RTU; Dako).

Preparation of single-cell complementary DNA libraries

Cryopreserved FTD were thawed in RPMI and then stained using the following mAbs: APC anti-human CD3 and FITC anti-human CD8. Zombie NIR Fixable Viability Kit (BioLegend) was used to exclude dead cells. CD3+CD8+ cells were sorted on an Aria III (BD Biosciences).

GEM (Gel Bead-In EMulsions) generation and barcoding from single-cell suspensions (5000–10,000 CD8+ T cells) were performed using the Chromium Next GEM Single Cell 5’ GEM Kit V.2 (PN-1000244) on the Chromium Next GEM Chip K Single Cell Kit (PN-1000286) according to the manufacturer’s instructions (10x Genomics, Pleasanton, California, USA). After the GEMs were broken and the pooled complementary DNAs (cDNAs) were amplified, TCR target amplification was performed using a TCR Amplification Kit (PN-1000252). The TCR and gene expression cDNA libraries were constructed using the Library Construction Kit (PN-1000190). The cDNA quality was assessed using an Agilent 2100 Bioanalyzer system (Agilent, Santa Clara, California, USA), and libraries were sequenced on a HiSeq System (Illumina, San Diego, California, USA) with a paired-end 150 bp sequencing strategy.

Preprocessing of paired scRNA-seq and scTCR-seq data

Raw sequencing data for RNA expression and VDJ from human CD8+ T scRNA-seq were processed using Cell Ranger software (V.6.0.1; 10x Genomics). RNA expression data were aligned to the GRCh38 reference genome and VDJ sequencing data to the GRCh38 VDJ reference pre-built by 10x Genomics (refdata-gex-GRCh38-2020-A) (Zheng et al, 2017). Gene expression count matrices were imported into the R package Seurat (V.4.1.0) using R (V.4.1.3). Genes expressed in <3 cells, as well as cells with <200 expressed genes, were excluded from the analysis. TCR genes were also removed from the count data. Cells were filtered to retain those with ≤10% mitochondrial RNA content and a unique molecular identifier count between 200 and 5000. RNA expression data were normalized against total expression per cell, multiplied by a scale factor of 10,000 and natural log transformed.
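
The filtering and normalization thresholds described here can be sketched in plain Python as follows. This is only an illustration and not the Seurat code used in the study: the dictionary-based data layout, the function names, the order in which the filters are applied, the `MT-` prefix for mitochondrial genes and the regular expression for TCR gene symbols are all assumptions.

```python
import math
import re

# Assumed naming pattern for TCR gene symbols (e.g. TRAV, TRBJ, TRBC).
TCR_GENE = re.compile(r"^TR[ABDG][VDJC]")


def qc_filter(counts, min_cells=3, min_genes=200, max_mito=0.10, umi_range=(200, 5000)):
    """counts maps cell barcode -> {gene symbol: UMI count}.
    Returns the cells and genes that pass the quality thresholds."""
    # Count how many cells express each gene (count > 0).
    cells_per_gene = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1

    kept = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        n_expressed = sum(1 for n in genes.values() if n > 0)
        mito = sum(n for g, n in genes.items() if g.startswith("MT-"))
        if total == 0 or n_expressed < min_genes:
            continue                      # too few expressed genes
        if mito / total > max_mito:
            continue                      # too much mitochondrial RNA
        if not umi_range[0] <= total <= umi_range[1]:
            continue                      # UMI count outside the accepted range
        # Drop rarely detected genes and TCR genes from the retained cell.
        kept[cell] = {g: n for g, n in genes.items()
                      if n > 0 and cells_per_gene[g] >= min_cells
                      and not TCR_GENE.match(g)}
    return kept


def log_normalize(cell_counts, scale_factor=10_000):
    """Divide by the total count of the cell, multiply by the scale factor,
    then take the natural log of (1 + value)."""
    total = sum(cell_counts.values())
    return {g: math.log1p(n / total * scale_factor) for g, n in cell_counts.items()}
```

The defaults mirror the thresholds stated in the text; they are exposed as parameters so that the same function can be tried on small toy matrices.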

After batch effect correction, we created an ‘integrated’ data assay for downstream analysis. We found a set of anchors using the FindIntegrationAnchors function and used these to integrate the three data sets. Counts were log-normalized, scaled, and centered. The 2000 most-variable features were calculated with variance-stabilizing transformation and used for principal component analysis. Clustering was performed with the Seurat FindClusters function with the resolution set to 0.5. Dimension reduction was performed with Uniform Manifold Approximation and Projection (UMAP). To analyze the antigen-specific CD8+ cells, 140 single cells expressing in vitro-confirmed antigen-specific TCRs (n=9) were normalized and re-clustered with a resolution of 0.5. For pseudotime analysis, the Seurat object was converted to a CellDataSet object and Monocle 3 was used to infer and build the developmental trajectory using naïve T cells as the root cluster.

Analysis of the scTCR-seq repertoire

The Shannon Index of diversity was calculated with the R package scRepertoire (V.1.3.5). For the publicly available TCR data analysis, CDR3β and VDJ sequences from public TCRs specific for common viruses (cytomegalovirus (CMV), Epstein-Barr virus (EBV) and influenza A) were downloaded from VDJdb (accessed 2-23-2022). We defined cells whose TIL TCRs matched the TCRβ CDR3 and the VDJ regions of these common virus-specific TCRs as public TCR-expressing cells.
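
For readers unfamiliar with these two steps, the following Python sketch shows how a Shannon index and a public-TCR match could be computed. It does not reproduce scRepertoire or the VDJdb query; the use of the natural logarithm, the matching key (CDR3β sequence plus V and J gene) and all names are assumptions.

```python
import math
from collections import Counter


def shannon_index(clonotype_per_cell):
    """Shannon diversity H = -sum(p_i * ln p_i), where p_i is the fraction
    of cells carrying clonotype i."""
    counts = Counter(clonotype_per_cell)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def public_tcr_cells(til_tcrs, public_tcrs):
    """til_tcrs maps cell barcode -> (cdr3b, v_gene, j_gene).
    public_tcrs is an iterable of the same triples for virus-specific TCRs.
    Returns the barcodes whose triple matches a public TCR exactly."""
    public = set(public_tcrs)
    return sorted(cell for cell, tcr in til_tcrs.items() if tcr in public)
```

A cluster dominated by a few expanded clonotypes gives a low index, whereas a cluster in which every cell carries a different TCR gives the maximum value, ln(number of cells).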

Differential expression analysis, Gene Set Enrichment Analysis and single sample GSEA

Differential gene expression was analyzed in two comparisons: virus-specific T cells (n=19) versus tumor antigen-specific T cells (n=140), and CTA-specific T cells (n=18) versus neoantigen-specific T cells (n=122). Gene Set Enrichment Analysis (GSEA) (V.4.0.3) was performed to evaluate T-cell signaling, proliferation and cytokine production. The individual RNA expression score for single sample GSEA (ssGSEA) was calculated using the gene set variation analysis program in R package V.4.0.1. After z-score normalization of the ssGSEA score for each gene set, we calculated the mean values of z-scores of gene sets.
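
The final summarization step (z-score per gene set, then the mean across gene sets) can be sketched as below. The ssGSEA scores themselves are taken as given; the data layout, the function names and the use of the sample standard deviation (as in R's `scale()`) are assumptions rather than details stated in the text.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a list of scores to mean 0 and sample SD 1."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def mean_zscore_per_cell(scores):
    """scores maps gene-set name -> list of ssGSEA scores, one per cell,
    with the same cell order in every list.
    Returns one averaged z-score per cell."""
    standardized = [zscore(v) for v in scores.values()]
    return [mean(per_cell) for per_cell in zip(*standardized)]
```

Standardizing each gene set before averaging keeps a gene set with a wide score range from dominating the combined value.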

Generation of autologous B-cell antigen presenting cells

Autologous B-cell antigen presenting cells (B-APCs) were generated from peripheral blood mononuclear cells (PBMCs) of each patient. γ-irradiated (96 Gy) human CD40L-transfected NIH3T3 cells21 (kindly provided by Dr G Freeman, Dana-Farber Cancer Institute, Boston, Massachusetts, USA) were plated in 6-well plates and cultured overnight at 37°C in 5% CO2. PBMCs were cultured at 4–6×10⁶ cells/well on B-APCs in the presence of interleukin (IL)-4 (4 ng/mL; PeproTech, Cranbury, New Jersey, USA).

Screening of TCR antigen specificity using transcriptionally active PCR (TAP) fragments

eBlocks gene fragments (Integrated DNA Technologies (IDT)) encoding variable regions of the TCRα and β chains were amplified by PCR. The amplified DNAs were then assembled into linearized plasmid vectors containing a constant region of a TCRα or TCRβ chain by the Gibson Assembly method, as previously reported.22 TAP fragments of TCRα and TCRβ that contained an EF-1α promoter and TK poly-A signals, together with the pGL4.30 (luc2P/NFAT-RE/Hygro) vector (Promega, E8481), were transfected into the ΔTCRβ Jurkat cell line J.RT3-T3.5 (ATCC TIB-153) expressing CD8α (CD8-J2)23 by electroporation (Neon Electroporation System, Thermo Fisher) to generate Jurkat-Luc-TCR cells. Jurkat-Luc-TCR cells (5×10⁴) were cultured overnight in the presence of autologous B-APC (5×10⁴) with or without antigenic peptides. Activation of the NFAT reporter gene was measured by the Steady-Glo Luciferase Assay System (Promega).

Generation of Jurkat cells stably expressing TCRs (stable Jurkat-TCR cells)

To express TCRs stably, the synthesized TCRβ chain gene encoding a VDJ region (eBlocks), codon-optimized gene encoding the TCRβ constant region conjugated with the self-cleaving P2A peptide, the synthesized TCRα chain gene encoding a VJ region (eBlocks), and the codon-optimized TCRα constant region gene were assembled together in a linearized pMXs-IRES-puro retroviral vector using the NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, E2621). The constructed plasmid vector, pMXs-TCRβ-P2A-TCRα-IRES-puro, was used for retrovirus production. We used a retrovirus-packaging cell line (Phoenix GP-GaLV)24 to produce a retrovirus for transducing TCR genes. The Jurkat cells (CD8-J2) retrovirally transduced with TCRs were cultured in the presence of puromycin (1 μg/mL) for selection (stable Jurkat-TCR cells).

Specificity of stable Jurkat-TCR cells determined using HEK293T cells expressing HLA and antigen

The full-length cDNA of HLA-A*24:02 was obtained from an HLA-A*24:02-positive healthy donor and cloned into pcDNA3.1(+) vector (Invitrogen, California, USA). pcDNA3.1(+)/HLA vectors (A*26:03, B*15:01, B*40:01, B*51:01, C*07:02 and C*14:02) were purchased from RIKEN BRC (Tsukuba, Japan).25 gBlocks (IDT) encoding partial mutant SORL1 pS385F, JAGN1 pK34R, AKT2 pR371P, ITGB5 pR445W and their corresponding partial wild-type proteins (online supplemental table S1) were also cloned into the pcDNA3.1(+) vector. HEK293T cells were transfected with a total of 2.5 µg of HLA and antigen plasmids (1:1 ratio) using Lipofectamine 2000 (Thermo Fisher Scientific) and incubated at 37°C overnight in 6-well plates. Post transfection, HEK293T cells (2×10⁴) were plated with stable Jurkat-TCR cells (5×10⁴) transduced with the pGL4.30 (luc2P/NFAT-RE/Hygro) vector. After overnight incubation, activation of the reporter gene driven by the NFAT-response element was measured by the Steady-Glo Luciferase Assay System according to the manufacturer’s instructions (Promega).

Statistical analysis

Comparison of results was by an unpaired, two-tailed Student’s t-test using GraphPad Prism V.5 (GraphPad Software). Correlations were calculated using Pearson’s correlation. A value of p<0.05 was considered statistically significant.
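
For reference, the two statistics named above can be written out in a few lines of Python. This sketch only computes the test statistic, its degrees of freedom and the correlation coefficient; it does not compute p values and is not a substitute for GraphPad Prism. The function names are illustrative.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance.
    Returns (t, degrees_of_freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)
```

The two-tailed p value would then be obtained by comparing |t| with the t distribution at the returned degrees of freedom.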


Patient characteristics

In this study, we focused on NSCLC cases from our earlier lung cancer cohort that had a high immune score and were positive for expression of the CTA gene KK-LC-1.20 This allowed us to use information and reagents (eg, KK-LC-1-specific TCRα and β chain genes) from Fukuyama and coworkers as controls.26 27 The characteristics of the three patients investigated here are summarized in table 1. Two had squamous cell carcinoma and one had adenocarcinoma. There were 127, 24 and 357 somatic missense mutations in the tumors of patients 1, 2 and 3, respectively. Patients 1 and 2 both harbored HLA-B*15:01, which is known to present the KK-LC-1-derived epitope KK-LC-1(76–84) (RQKRILVNL).26 Representative immune-related gene expression profiles from bulk RNA sequencing data (online supplemental figure S1) and immunohistochemical analysis of immune cells (CD3+ and CD8+ T cells) and immune checkpoint molecules (PD-1 and PD-L1) confirmed our previously reported immune score data20 (online supplemental figure S2). Flow cytometric analysis of fresh tumor digests confirmed the infiltration of CD8+ T cells expressing PD-1 or CD39 and CD103 (ref 16) into the tumor (online supplemental figure S3).

Table 1

Patients’ characteristics

Phenotypic clusters of CD8+ TILs according to scRNA-seq

The overall approach to identify tumor-specific CD8+ TILs and their cognate antigens is shown schematically in figure 1A. We performed high-throughput CD8+ scRNA-seq and scTCR-seq to characterize the phenotype and clonality of CD8+ T cells. A total of 6998 CD8+ T cells was classified by the UMAP dimensionality reduction algorithm into 10 clusters based on the expression of T cell-related genes (figure 1B–D). These clusters were consistent with the expected main T-cell subsets, namely effector memory T (Tem), memory T (Tm), activated T (Tact), exhausted T (Tex) and naive T (Tn) cells, and with certain less well represented phenotypes such as proliferating T (Tprol) cells, apoptotic T cells, natural killer (NK)-like cells, and γδ-like T cells. Pseudotime trajectory analysis documented that the differentiation of CD8+ T cells followed a characteristic trajectory from naive T cells to exhausted T cells (figure 1E). Hierarchical clustering analysis divided them into ‘exhausted’ phenotypes (Tprol and Tex) and ‘non-exhausted or naive’ phenotypes (Tm, Teff and Tn) (figure 1F). Further analysis using markers that have previously been reported to be informative revealed that the Tex cluster was characterized by the expression of ENTPD1 (CD39), along with the expression of T-cell exhaustion marker genes such as PDCD1 (PD-1) and HAVCR2 (TIM3), effector molecule genes such as GZMB, GZMK, PRF1 and IFNG, and the transcription factor TOX (ref 17) (figure 1G). In the Tex cluster, CXCL13, CTLA4, CXCR6, KLRB1 (CD161), and KRT86 were the top five most differentially expressed genes (figure 1H).

Figure 1

Phenotypic clustering of CD8+ TILs by single-cell RNA sequencing. (A) Overall scheme of this study. (B) The Uniform Manifold Approximation and Projection (UMAP) of the expression profiles of the 6998 single CD8+ T cells derived from the three surgical lung tumor tissues. CD8+ T cells are classified into 10 distinct transcriptional clusters. (C) UMAPs of CD8+ T cells in each patient. (D) Pie charts showing the proportions of CD8+ T cells in each cluster for all three samples and each patient separately. (E) Pseudotime trajectory analysis of 6998 CD8+ T cells. Each dot represents one single cell, colored by its pseudotime score from dark blue to yellow, indicating early and terminal states, respectively. (F) The normalized average expression of phenotypic and functional signatures for CD8+ T cell subpopulations defined in B. (G) Violin plots quantifying relative transcriptional expression of genes (rows) with high differential expression among each cluster (columns). (H) Relative expression of the top five most differentially expressed genes in each cluster. NK, natural killer; scRNA-seq, single cell RNA sequencing; scTCR-seq, single cell TCR sequencing; Tact, activated T; TCR, T-cell receptor; Tem, effector memory T; Tex, exhausted T; Tm, memory T; Tn, naive T; Tprol, proliferating T; Teff, effector like T; Tgd, γδ-like T; Tapop, apoptotic T; WES, whole exome sequencing.

TCR repertoire analysis in each of the CD8+ T-cell clusters

Next, we investigated the TCR repertoires of cells in these 10 clusters. A total of 2480 TCR clonotypes from the three patients was used for complementarity-determining region 3β (CDR3β) TCR sequencing analysis. The numbers of different clonotypes in all three patients and in each patient separately were highest in the Tm and Tn clusters (figure 2A upper and online supplemental tables S2–S4). The fraction of T cells in each cluster that expressed a distinct CDR3β (number of clonotypes/number of clones) was lower in Tex (figure 2A lower), suggesting a higher degree of oligoclonality in this cluster. The Shannon Diversity Index also showed a lower TCR diversity in cells of the Tex cluster (figure 2B). Interestingly, the TCRs in cells of the Tex cluster overlapped to a great extent with those in Tprol (figure 2C).
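
The two repertoire measures used in this paragraph can be illustrated with a short Python sketch. The distinct-clonotype fraction follows the definition given in the text; the overlap measure is an assumption (an overlap coefficient), since the exact metric behind figure 2C is not stated here, and the function names are hypothetical.

```python
def distinct_fraction(cdr3_per_cell):
    """Number of distinct CDR3β clonotypes divided by the number of cells.
    Values near 1 indicate a diverse cluster; low values indicate oligoclonality."""
    return len(set(cdr3_per_cell)) / len(cdr3_per_cell)


def clonotype_overlap(cluster_a, cluster_b):
    """Overlap coefficient: shared clonotypes divided by the size of the
    smaller clonotype set (assumed metric, for illustration only)."""
    a, b = set(cluster_a), set(cluster_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

With this definition, a cluster of 100 cells that share only 10 CDR3β sequences has a distinct fraction of 0.1.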

Figure 2

TCR repertoire analysis in each CD8+ T-cell cluster. (A) The number of unique clonotypes (upper) and the fraction of T cells that express distinct complementarity-determining region 3β (number of clonotypes/number of clones) (lower) in each cluster in all patients (left) and in each patient (right). (B) Shannon Diversity Indexes in 10 clusters. (C) The rate of overlapping clonotypes among different clusters. Tact, activated T; TCR, T-cell receptor; Tem, effector memory T; Tex, exhausted T; Tm, memory T; Tn, naive T; Tprol, proliferating T; Teff, effector like T; Tgd, γδ-like T; Tapop, apoptotic T.

Identification of tumor antigens recognized by CD8+ T cells

To investigate tumor-specific CD8+ T-cell populations, we selected TCR clonotypes represented at least twice in descending order of frequency in each cluster in patient 1, resulting in the evaluation of a total of 70 TCR clonotypes including 3 Tem-TCRs, 3 Tact-TCRs, 43 Tm-TCRs, 1 Tn-TCR, 1 γδ-like T-TCR, 1 NK-like-TCR and 18 Tex-TCRs (online supplemental table S2). To test for recognition of the target antigens, we synthesized 41 predicted neoantigen peptides according to the missense mutations as well as 14 KK-LC-1 peptides including the known HLA-B*15:01-restricted epitope (KK-LC-1(76–84): RQKRILVNL) (online supplemental table S5). Seventy TCR TAP fragment/luciferase-transduced Jurkat (TCR TAP/luc-Jurkat) cells were then co-cultured with autologous B-APC pulsed with each of 11 peptide pools. Specific responses were recorded in only four Tex-TCR TAP/luc-Jurkat cells (figure 3A). We identified two different Tex-TCRs (ex06 and ex09) recognizing the same pooled peptides (pools 6 and 7), which were found to be specific for the same neoantigen (SORL1 pS385F) (figure 3B, left-hand side). Another two different Tex-TCRs (ex08 and ex15) recognizing the same pooled peptides (pool 9) were found to respond to the same HLA-B*15:01-restricted KK-LC-1 peptide (KK-LC-1(76–84): RQKRILVNL) (figure 3B right-hand side, and online supplemental figure S4A). No TCR clonotypes from any other clusters mediated a response to any of the 41 predicted neoantigens or the 14 KK-LC-1 peptides.
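
The selection and pooling logic of this screen can be sketched in Python as follows. It illustrates the general idea only: the input layout, the function names and the round-robin pooling scheme are hypothetical, and the actual composition of the peptide pools is not described in the text.

```python
from collections import Counter, defaultdict


def select_expanded_clonotypes(cells, min_cells=2, top_n=None):
    """cells is an iterable of (cluster, clonotype) pairs, one pair per cell.
    For each cluster, return the clonotypes seen in at least min_cells cells,
    ordered by descending frequency, optionally truncated to the top_n."""
    per_cluster = defaultdict(Counter)
    for cluster, clonotype in cells:
        per_cluster[cluster][clonotype] += 1
    selected = {}
    for cluster, counts in per_cluster.items():
        ranked = [(c, n) for c, n in counts.most_common() if n >= min_cells]
        selected[cluster] = ranked[:top_n] if top_n else ranked
    return selected


def make_pools(peptides, n_pools):
    """Distribute peptides into n_pools pools in round-robin order
    (hypothetical scheme)."""
    return [peptides[i::n_pools] for i in range(n_pools)]
```

Testing pools first and individual peptides only from positive pools reduces the number of reporter assays: 55 peptides in 11 pools require 11 assays per TCR in the first round instead of 55.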

Figure 3

Identification of tumor antigens recognized by CD8+ T cells. (A) The specificity of 70 TCRs from all 10 clusters for 41 predicted neoantigen peptides and 14 KK-LC-1 peptides tested by screening patient 1. All TCR clonotypes with more than one clone in each cluster were selected for testing. TCR TAP fragment/luciferase-transduced Jurkat (TCR TAP/luc-Jurkat) cells were co-cultured with autologous B-cell antigen presenting cells and pooled antigenic peptides (5–8 peptides/pool). Activation of TCR signaling was assessed by luciferase reporter assay driven by the NFAT-response element. (B) Individual peptides in positive pools 6, 7 and 9 were subsequently tested separately by further luciferase reporter assay. (C) In patient 2, the top 10 TCRs in descending order of frequency plus 5 TCRs with one clone from the Tex cluster were tested against 14 predicted neoantigen peptides and 26 KK-LC-1 peptides. (D) The individual peptides in the positive pools were subsequently tested separately. (E) In patient 3, the top 15 TCRs from the Tex cluster were tested against 30 predicted neoantigen peptides. (F) The individual peptides in the positive pools were subsequently tested separately. (G) All identified TCR clones (n=140) with the nine tumor-antigen specific TCRs in three patients were projected onto UMAPs. UMAP, Uniform Manifold Approximation and Projection; TAP, Transcriptionally Active Polymerase Chain Reaction; TCR, T-cell receptor; Tex, exhausted T; RLU, relative light unit.

Given that all T cells specific for tumor antigens were found within the Tex cluster in patient 1, in subsequent analyses for patients 2 and 3 we focused on the same cluster. In patient 2, we examined TCR reactivity to 14 neoantigen and 26 KK-LC-1 peptides (online supplemental table S5) using the top-10 TCR clonotypes represented at least twice in descending order of frequency plus 5 TCR clonotypes represented once in the Tex cluster (online supplemental table S3). Three Tex-TCRs (ex08, ex09 and ex12) mediated responses in this case (figure 3C). Ex09 recognized the same HLA-B*15:01-restricted KK-LC-1 peptide (KK-LC-1(76–84): RQKRILVNL), while ex08 recognized a novel HLA-C*14:02-restricted peptide (KK-LC-1(23–31): RFQRNTGEM) although it was originally predicted to be HLA-A*24:02 restricted (online supplemental table S5 and online supplemental figure S4A, B). Ex12 recognized a neoantigen (JAGN1 pK34R) (figure 3D). For patient 3, we examined reactivity of the top-15 TCR clonotypes in descending order of frequency in the Tex cluster (online supplemental table S4) to 30 neoantigen peptides (online supplemental table S5). Two Tex-TCRs (ex02 and ex11) mediated positive responses (figure 3E). Ex02 and ex11 recognized the neoantigens AKT2 pR371P and ITGB5 pR445W, respectively (figure 3F). We confirmed that all neoantigen peptides were naturally processed (online supplemental figure S5). Collectively, these data document nine different TCR clonotypes and five cognate tumor antigens from three patients (online supplemental table S6). All identified TCR clones (n=140) were projected onto UMAPs (figure 3G). Identical TCR clonotypes in the Tex cluster were also present in the Tprol, Tem, Tact, Tapop (apoptotic T), and NK-like cell clusters.

Gene expression analysis of CD8+ T cells responding against neoantigens, CTA or viral antigens

To characterize the features of tumor antigen-specific T cells, we first compared them with virus-specific T cells which infiltrate the tumor as bystander T cells.15 For this, we accessed virus-specific T-cell clonotypes in the public domain (online supplemental table S7). As previously reported,17–19 most such T cells localized to the various non-exhausted T-cell clusters (figure 4A). We compared gene expression by tumor antigen-specific T cells (n=140) with virus-specific T cells (n=19) by differential expression analysis. As expected, the exhausted T cell-related genes were dominantly expressed in tumor antigen-specific T cells (figure 4B).

Figure 4

Gene expression analysis of CD8+ T cells against neoantigens, CTA or viral antigens. (A) UMAPs of the distribution of virus antigen-specific TCRs from three cases. Data on complementarity-determining region 3β and VDJ from TCRs specific for common viruses (CMV, EBV and influenza A) were downloaded from VDJdb (accessed 12/11/2020). A total of 19 TCRs specific for viruses (online supplemental table S7) were back projected onto UMAPs. (B) Analysis of genes differentially expressed between tumor antigen-specific and virus-specific T cells. (C) UMAPs of the distribution of KK-LC-1- and neoantigen-specific TCRs from three cases. (D) Analysis of genes differentially expressed between KK-LC-1- and neoantigen-specific T cells. (E) Gene Set Enrichment Analysis of T cells specific for KK-LC-1 or neoantigens. CTA, cancer/testis antigens; UMAP, Uniform Manifold Approximation and Projection; TCR, T-cell receptor; CMV, cytomegalovirus; EBV, Epstein-Barr virus.

We next compared gene expression profiles between T cells responding against different tumor antigens (CTA KK-LC-1 or neoantigens) (figure 4C), but found no differentially expressed genes (figure 4D). However, Gene Set Enrichment Analysis (GSEA) using gene sets in MSigDB V.2022.1.HS revealed that gene sets related to T-cell signaling, activation, proliferation and cytokine production were enriched in the T cells reactive against neoantigens but not in those reactive to the CTA (figure 4E and online supplemental table S8).
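As an illustration of how a gene set can appear enriched even when no single gene passes a differential-expression threshold, the following is a minimal sketch of the classic weighted running-sum enrichment score used in GSEA. It is not the authors' pipeline: the ranking metric, the function name `enrichment_score` and the toy gene labels are assumptions made for this example only.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene labels sorted by descending ranking score.
    scores: ranking scores in the same order (hypothetical metric).
    gene_set: set of gene labels to test.
    p: weight exponent applied to the scores of genes in the set.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Total weight of hits, used to normalize each upward step.
    hit_total = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)

    running = 0.0
    best = 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** p / hit_total  # step up at a set member
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running                       # keep the largest deviation
    return best


# Toy ranked list (labels are placeholders, not genes from the study).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranking = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

print(enrichment_score(genes, ranking, {"g1", "g2"}))  # set at the top of the list
print(enrichment_score(genes, ranking, {"g5", "g6"}))  # set at the bottom of the list
```

A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom a negative score. In practice, significance is assessed by permutation, which this sketch omits.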

Differentiation and functional states of tumor antigen-specific CD8+ T cells

To further investigate these minor differences between tumor antigen-specific T cells, we re-clustered the 140 T-cell clones, which divided into two major clusters: ‘effector-like’, marked by the expression of GZMA and PRF1, and ‘progenitor-like’, marked by the expression of CCR7 and IL-7R (figure 5A). The proportion of ‘progenitor-like’ cells was higher among CTA-specific T cells than among neoantigen-specific T cells. Recent reports indicated a linear differentiation pathway of virus-specific T cells from PD-1+TCF1+ ‘progenitor exhausted’, CX3CR1+PD-1+Tim-3+ ‘intermediate or transitory’, to CD101+PD-1+Tim-3+ ‘terminally differentiated exhausted’ T cells in a mouse chronic infection model.28 In the TILs of human cancers, two pre-exhausted T-cell phenotypes have been described (GZMK+ T cells and ZNF683+CXCR6+ T cells) leading to terminally exhausted T cells.29 30 Melanoma antigen-specific T cells also included proliferating cells expressing MKI67.17 Thus, to clarify the differentiation status of each tumor antigen-specific T cell, we investigated the expression of these genes by the 140 re-clustered T cells (figure 5B) as well as in all 6998 CD8+ T cells examined (online supplemental figure S6). Hierarchical clustering showed that the expression of the differentiation marker genes TCF7, GZMK, CX3CR1, ZNF683, CD101 and MKI67 was largely mutually exclusive, despite some overlap in individual T-cell clones (figure 5C). This suggests that individual T-cell clones possessed different differentiation and functional states even within the same Tex cluster.

Figure 5

Differentiation and functional states of tumor antigen-specific CD8+ T cells. (A) Re-clustering of 140 antigen-specific T cells and UMAPs showing two different phenotypic clusters. The distribution of T-cell receptor against different tumor antigens (CTA or neoantigen) in each cluster is shown. (B) UMAPs of T-cell subset-defining genes. (C) Hierarchical clustering showing the log-normalized expression of T-cell subset-defining genes in individual T-cell clones (n=140). (D) The expression of T-cell subset-defining genes in each tumor antigen-specific T cell. CTA, cancer/testis antigens; UMAP, Uniform Manifold Approximation and Projection.

According to the nature of the tumor-specific antigen recognized, some T-cell clones expressed the progenitor marker TCF7 (namely, the KK-LC-1-specific, mutant SORL1-specific, mutant JAGN1-specific and mutant AKT2-specific T cells), but some did not (mutant ITGB5-specific T cells) (figure 5D). Regarding the intermediate or pre-exhausted T-cell marker CX3CR1, this was expressed by KK-LC-1-specific and mutant SORL1-specific T cells, but not by mutant AKT2-specific or ITGB5-specific T cells. Instead, GZMK, ZNF683 and MKI67 were highly expressed in some mutant AKT2-specific T cells. T-cell clones with the terminal exhaustion marker CD101 were frequently observed within the population of mutant ITGB5-specific T cells. Taken together, these results suggest that differentiation states were different for each of the tumor antigen-specific T cells.

TCR signal strengths and their relation to T-cell differentiation and function

To clarify the relevance of TCR signal strengths for T-cell differentiation and function, we assessed the avidity of each TCR clonotype for the respective tumor antigen using an NFAT-reporter luciferase assay. For this, full length TCRα and TCRβ sequences were cloned and retrovirally transduced into Jurkat cells (stable Jurkat-TCR cells). We also cloned high affinity TCRs for CMV pp65 (reference 24) as a control. The logEC50 values for nine Jurkat-TCR cells against five antigens ranged from 1.976 to 4.175 (figure 6A), compared with 0.9782 for the CMV pp65 antigen. The TCR signal strength of each TCR clonotype was then represented as the reciprocal of the logEC50 value (1/logEC50) (figure 6B). The signaling strength in Jurkat-TCR cells defined in this way on challenge with KK-LC-1 tended to be lower than with the neoantigens, but this did not reach statistical significance (p=0.067). Interestingly, among TCRs for the different tumor antigens, those specific for mutant ITGB5 exhibited the highest functional response (figure 6A,B). It is known that clones with higher-affinity TCRs are biased towards a terminally exhausted T-cell fate. Thus, mutant ITGB5-specific T cells might be closest to terminal exhaustion, as evidenced also by CD101 expression (figure 5D).
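The 1/logEC50 transformation can be illustrated with the three logEC50 values quoted above. This is a minimal sketch: the function name `signal_strength` is hypothetical, and the concentration units underlying the logEC50 values are not stated in this section, so the numbers are treated as unitless.

```python
def signal_strength(log_ec50: float) -> float:
    """Return 1/logEC50, so that a lower logEC50 maps to a higher value."""
    if log_ec50 <= 0:
        # The reciprocal only preserves the ordering for positive logEC50 values.
        raise ValueError("logEC50 must be positive for this transformation")
    return 1.0 / log_ec50


# logEC50 values quoted in the text (range for the nine tumor TCRs, plus control).
reported = {
    "tumor TCRs, lowest logEC50": 1.976,
    "tumor TCRs, highest logEC50": 4.175,
    "CMV pp65 control": 0.9782,
}

for label, value in reported.items():
    print(f"{label}: 1/logEC50 = {signal_strength(value):.3f}")
```

On this scale the nine tumor antigen-specific TCRs span roughly 0.24 to 0.51, whereas the CMV pp65 control sits near 1.02, which is consistent with its use as a high-affinity reference.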

Figure 6

TCR signal strengths and their relation to T-cell differentiation and function. (A) Assessment of the functional avidity of nine Jurkat-TCR cells. The cells were co-cultured with autologous B-cell antigen presenting cells pulsed with different concentrations of antigenic peptides. Jurkat-TCR cells specific for CMV pp65 (reference 24) were used as a high affinity TCR T-cell control. (B) Summary of the 1/logEC50 values for nine tumor antigen-specific T cells. (C) Correlations between TCR signal strengths (1/logEC50) and the mean value of z-scores for each tumor antigen-specific TCR clonotype for GOBP gene signatures related to cell signaling, cytotoxicity, proliferation, response to cytokine and cytokine production. (D) The z-scores of each clone in GOBP gene sets in C plotted according to the tumor antigen-specificity. TCR, T-cell receptor; CMV, cytomegalovirus; GOBP, Gene Ontology Biological Process.

Finally, we examined the relevance of TCR signaling in vitro and T-cell features in vivo for cells with the nine TCR clonotypes. The TCR signal strengths were found to be moderately correlated with the mean values of z-scores in each TCR clonotype for the negative wingless/integrase-1 (WNT) signaling pathway (Pearson’s correlation (r)=0.671, p=0.048), granzyme-mediated programmed cell death signaling (r=0.651, p=0.057), T-cell proliferation (r=0.639, p=0.064), response to interferon gamma (r=0.611, p=0.080), and tumor necrosis factor superfamily cytokine production (r=0.608, p=0.082) (figure 6C and online supplemental table S9). For these gene signatures, the z-scores of the neoantigen-specific T cells were variable but significantly higher than in the case of KK-LC-1-reactivity (figure 6D), consistent with the GSEA data comparing KK-LC-1- and neoantigen-specific T cells (figure 4E).
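To show how a Pearson coefficient from only nine clonotypes translates into the borderline p values above, here is a minimal sketch. It assumes a two-sided t test with n-2 degrees of freedom, which is the standard test for Pearson's r but is not stated explicitly in this section. The helper names are hypothetical, and the p value is obtained by numerical integration so that only the standard library is needed.

```python
import math


def pearson_t(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r observed on n paired values."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3.0  # P(0 < T < |t|)
    return 1.0 - 2.0 * central


n_clonotypes = 9  # nine TCR clonotypes, as in the text
for label, r in [("negative WNT signaling", 0.671), ("T-cell proliferation", 0.639)]:
    t = pearson_t(r, n_clonotypes)
    p = two_sided_p(t, n_clonotypes - 2)
    print(f"{label}: r={r}, t={t:.3f}, p={p:.3f}")
```

Under these assumptions the computed p values are close to those reported. With seven degrees of freedom, a coefficient of about 0.67 is needed to fall below p=0.05, which is why only the WNT-related signature crosses that threshold.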


Discussion

In this study, we performed a single-cell sequencing analysis to evaluate the phenotype of TILs in surgically resected tumors from patients with lung cancer. To identify tumor-specific CD8+ T cells and their cognate tumor antigens, we selected patients whose tumors exhibited high immune scores and expressed the CTA gene KK-LC-1. We found that it was predominantly CD8+ T cells with an exhausted phenotype in the TILs that recognized several tumor antigens including neoantigens and KK-LC-1 epitopes.

The advent of single-cell sequencing analysis has allowed the phenotype and tumor specificity of CD8+ TILs in melanoma to be profiled at high resolution.17 That study found two major distinct phenotypic patterns; the predominant phenotype was either ‘non-exhausted memory’ or ‘exhausted’ CD8+ T cells. Melanoma tumor antigens, including shared and unique antigens, were mainly recognized by CD8+ T cells with an exhausted phenotype, and only rarely by those with memory properties. In other single-cell analyses of CD4+ and CD8+ TILs from metastatic human cancers, neoantigen-reactive T cells exhibited dysfunctional differentiated phenotypes similar to exhausted T cells in both CD4+ and CD8+ subsets.19 In addition, neoantigen-specific TILs in neoadjuvant anti-PD-1-treated lung cancers were characterized by hallmark transcriptional programs of tissue-resident memory cells, expressing high levels of ZNF683 (HOBIT) and ITGAE (CD103) and a low level of IL-7R.18 In our study analyzing resected lung cancers from patients without immunotherapy, we found similar characteristics in antigen-specific CD8+ T cells (Tex cluster), which exhibited exhausted and resident memory phenotypes.

It has been previously reported that patients with melanoma with higher frequencies of intratumoral TPE cells (T progenitor exhausted; TCF1+PD-1+CD8+ T cells) experienced a durable response to ICI.31 It is considered that effective antitumor immune responses elicited by ICI may depend on TPE cells in the tumor, whose function is restored by the therapy.31 Moreover, a reinvigorated CD39-negative stem-like phenotype expanded from less-exhausted tumor-specific TILs by ex vivo activation was associated with response to adoptive T-cell therapy and long-term persistence.32 Therefore, the presence of non-exhausted T cells with a restorable functionality in the tumor may be required to mediate tumor control. However, we have shown that such progenitors characterized by a TCF1+PD-1+CD8+ T-cell phenotype constituted only a minor population of tumor antigen-specific T cells. Further studies focused on analyzing sequential tissue specimens before and after immunotherapies such as ICI are needed to characterize these phenotypes and their function and dynamics within the tumor. Given that most antigen-specific T cells at the tumor site are transitory or terminally differentiated exhausted populations, which are not likely to be restored by ICI, generating de novo non-exhausted T cells may be important to achieve a productive antitumor response. Cancer vaccines using Tex-TCR-recognized antigens to expand the memory T cells outside tumors, or adoptive T-cell transfer of Tex-TCR-transduced T cells, are expected to become promising antigen-based immunotherapeutic approaches.

Recent progress in next-generation sequencing has demonstrated that tumor mutational burden correlates with prognosis and antitumor responses to ICI in some cancer indications.33 Thus, mutation-derived neoantigens are considered to be promising targets of ICI-reinvigorated or ICI-de novo activated T cells.34 35 On the other hand, it has also been reported that CTA (eg, KK-LC-1)-specific T cells are a dominant T-cell subset in TILs of a patient with human papillomavirus (HPV)-positive cervical cancer who achieved a complete response on adoptive TIL therapy.36 KK-LC-1, originally identified from lung cancer,26 37 is widely expressed by many types of cancer.38 39 In our study, KK-LC-1-specific T cells were present in the same exhausted cell cluster as neoantigen-specific T cells in two patients, and had similar gene expression profiles. However, there were some differences in terms of TCR signal strengths in vitro, cell proliferation and cytokine production, which were dominated by neoantigen-specific T cells. In general, T cells with a high avidity for mutation-derived neoantigens are commonly found, but this is not the case for tumor-associated antigens such as CTA.40 Indeed, TCR avidities for neoantigens are relatively higher than for melanoma-associated antigens such as MAGE, PMEL or TYR in patients with melanoma.17 Similarly, we found that TCR avidities for neoantigens were variable but tended to be higher than for KK-LC-1 in our patients with lung cancer, although this difference did not reach statistical significance.

Re-clustering of antigen-specific T-cell clones in this study demonstrated that differentiation states differed for each of the tumor antigen-specific T cells. Some antigen-specific T cells (eg, mutant ITGB5-specific T cells) were skewed to terminally exhausted T cells that are not likely to be enhanced by ICI. Accordingly, examining which tumor-specific T cells and their cognate antigens become promising targets for ICI and/or antigen-based immunotherapies will be necessary. Burger et al showed that antigen dominance hierarchies shape T-cell phenotypes in a mouse lung cancer model persistently expressing two neoantigens. Thus, dominant antigens induce T cells with a TCF1+ progenitor phenotype that correlates with the response to ICI, while subdominant antigens activate CCR6+TCF1+ T cells that are not enhanced by ICI but can be functionally improved by cancer vaccination.41 Human cancers express variable numbers of antigens present at different levels and with a different clonal distribution on the component cells of the tumor. This is likely to influence hierarchical ordering. Therefore, these insights warrant further investigation regarding those lung cancer antigens shaping different T-cell phenotypes, which ultimately affect therapeutic efficacy. This could lead to a new strategy for patient selection for immunotherapy42 43 as well as for combination with cancer vaccines to optimally engage concurrent T-cell responses against different antigens.

There are several limitations to the present study. We analyzed only three surgically resected lung tumors from three different patients, focused only on exhausted T cells, and evaluated only missense mutation-derived neoantigens and a single CTA as tumor antigens. Because the analysis was restricted to exhausted T cells, there might also be a bias in the differentiation and functional states observed for each of the antigen-specific T cells. Despite these limitations, we demonstrated that tumor-specific T-cell populations defined by single-cell analysis recognized several tumor antigens from patients with lung cancer. Further studies will be needed to elucidate more precise phenotypes and functions of these tumor-specific T cells, and their responses to broader landscapes of tumor antigens11 44–46 in order to identify promising therapeutic targets for antigen-based immunotherapy.

Data availability statement

Data are available upon reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by Ethics Committee(s) or Institutional Board(s): Aichi Cancer Center Ethics Committee, reference number or ID: 2018-2-20; The Ethics Committee of Gifu University Graduate School of Medicine, reference number or ID: 2019-075. Participants gave informed consent to participate in the study before taking part.


Acknowledgments

We are grateful to all the patients who participated in this study. We acknowledge the excellent technical assistance of Seiko Shibata, Yoshiko Suzuki, Naoko Takeda and Masami Iwano for the preparation of tumor tissue specimens or plasmid constructs, and Yasuko Fujihara and Yuki Abe for assisting with the analysis of patient information. The super-computing resource was provided by the Human Genome Center, the Institute of Medical Science, the University of Tokyo.





  • HKo and SS contributed equally.

  • Contributors HM and HI conducted the study, accepted full responsibility for the study, had access to all the data and the final decision to submit for publication. HKo, SS, TF, HKi, YY, KD, HI and HM designed the project. HKo, SS, YF, AD-O, KM, TM, YS, RN, CT and HH performed the experiments. HKo, SS and RY performed bioinformatics analyses. YT, TO, YS, FW, KA and HKu provided human clinical samples. DMi, YT, KOnoue, KOnoguchi, YY, RS and TC predicted tumor antigens by artificial intelligence. HKo, SS, DMu and HM analyzed the results and wrote the manuscript. All authors reviewed the manuscript.

  • Funding This work was supported by the Aichi Cancer Center Joint Research Project on Priority Areas (HM) and was also supported in part by the Japan Society for the Promotion of Science KAKENHI grant numbers 20K16380 (HKo), 22K20810 (SS), 20K09187 (YT), 21K19939 (RY), 19H03528 (HM) and 22H02934 (HM), and the Japanese Respiratory Foundation (SS). This work was also supported in part by research grants from the Uehara Memorial Foundation (RY).

  • Competing interests This study was partly funded by NEC Corporation.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.