Despite promising clinical results in a small subset of malignancies, therapies based on engineered chimeric antigen receptor and T-cell receptor T cells are associated with serious adverse events, including cytokine release syndrome and neurotoxicity. These toxicities are sometimes so severe that they significantly hinder the implementation of this therapeutic strategy. For a long time, existing preclinical models failed to predict severe toxicities seen in human clinical trials after engineered T-cell infusion. However, in recent years, there has been a concerted effort to develop models, including humanized mouse models, which can better recapitulate toxicities observed in patients. The Accelerating Development and Improving Access to CAR and TCR-engineered T cell therapy (T2EVOLVE) consortium is a public–private partnership directed at accelerating the preclinical development and increasing access to engineered T-cell therapy for patients with cancer. A key ambition in T2EVOLVE is to design new models and tools with higher predictive value for clinical safety and efficacy, in order to improve and accelerate the selection of lead T-cell products for clinical translation. Herein, we review existing preclinical models that are used to test the safety of engineered T cells. We will also highlight limitations of these models and propose potential measures to improve them.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.
Adoptive T-cell therapy, which relies on the infusion of tumor-reactive T cells that can recognize and kill malignant cells, has demonstrated remarkable efficacy in several advanced-stage cancers. This therapy requires primary human T cells to be genetically modified to express tumor-specific receptors that consist of either a T-cell receptor (TCR) or a chimeric antigen receptor (CAR). TCRs are heterodimeric glycoproteins composed of TCR-α and β chains associated with the CD3 complex, able to recognize target antigens in the context of a specific peptide–major histocompatibility complex (MHC). CARs, on the other hand, are synthetic receptors consisting of an MHC-independent antigen-binding moiety commonly derived from a tumor-specific monoclonal antibody, fused to an intracellular signaling region, mainly composed of the CD3ζ chain and costimulatory molecules derived from CD28 or 4-1BB, although other domains are currently being tested.1 Notwithstanding impressive clinical benefit in a small subset of malignancies, therapies based on engineered T cells are associated with potentially life-threatening toxicities. Importantly, preclinical models have mostly failed to predict these complications in humans, as they were primarily designed for testing efficacy at the time of the first toxicity observation in patients.
Here, we will review the main toxicities associated with engineered T-cell therapy and preclinical models currently used to study these adverse events. Recently, many efforts have been dedicated to the establishment of more predictive and reliable models. We will thus highlight the advantages, as well as the limitations, of current models and propose measures to have preclinical models fit for purpose with respect to engineered T-cell toxicity profiling.
Toxicities and preclinical models
Cytokine release syndrome (CRS) and neurotoxicity
One of the most common and potentially fatal immune-related adverse events of CD19 CAR T-cell therapy is CRS2–8 (figure 1). According to the American Society for Transplantation and Cellular Therapy (ASTCT) consensus grading system, CRS is described as an immune effector cell-associated supraphysiological response following any immune therapy, resulting in activation of endogenous or infused T cells, as well as other immune cells, that must include fever at the onset and may additionally include hypotension, capillary leak, and organ dysfunction.9 Recent studies have highlighted the key role of myeloid and endothelial cell activation in the propagation and worsening of the syndrome and have identified gasdermin E-mediated target cell pyroptosis as a primary trigger for macrophage activation.3 10 11 CRS is also the most common adverse event observed in patients with multiple myeloma (MM) receiving B-cell maturation antigen (BCMA) CAR T cells.12 Patients receiving CAR T cells are closely monitored within the first 10 days after infusion for any sign of CRS (eg, fever >38°C). CRS management needs to follow a grading and risk-adapted approach. Low-grade CRS can be treated symptomatically (antipyretics and fluids), whereas patients developing CRS of grade 3 or 4 may be treated with vasopressors, tocilizumab (an interleukin (IL)-6 receptor antagonist), and/or low-dose, or if required, high-dose corticosteroids.13
Neurotoxicity, also known as immune effector cell-associated neurotoxicity syndrome (ICANS), has been reported in all CD19 CAR T clinical trials exhibiting a robust immune response,14 with more than 60% of patients experiencing toxic neurological effects (figure 1). While neurotoxicity has often been described to be associated with CRS,15 each toxicity can occur independently,16–18 with a grading now very well defined.9 ICANS is usually self-limiting but can necessitate admission to the intensive care unit and is rarely fatal.19 20 Clinical manifestations of neurotoxicity include confusion, language disturbance, fine motor skill deficits, encephalopathy, somnolence, dysphasia, aphasia, seizures, cerebral edema with coma, and death.16 18 21 Molecular mechanisms of ICANS include systemic inflammatory responses triggered by myeloid cells that activate endothelial cells and increase the permeability of the blood–brain barrier (BBB).16 Once the BBB becomes dysfunctional, the cerebrospinal fluid can be exposed to high concentrations of systemic cytokines and immune cells, which can result in brain vascular pericyte stress and secretion of endothelium-activating cytokines.22 Recently, CD19 CAR T cell-related ICANS has also been related to the recognition of CD19+ brain mural cells.23 ICANS has also been observed in patients treated with BCMA CAR T cells, even though its incidence appears to be more heterogeneous among different clinical trials.
As of now, most patients with MM experience mild and reversible ICANS, with no reported deaths due to this adverse event.12 The standard of care for neurotoxicity includes supportive care and corticosteroids to induce immunosuppression.16 Treatment of neurotoxicity may also include inhibition of IL-6 with or without corticosteroid administration,22 but this appears more effective for CRS.16 17 Additional treatment strategies for CRS and neurotoxicity include targeting granulocyte macrophage colony-stimulating factor (GM-CSF), IL-1, tumor necrosis factor alpha, JAK/STAT, ITK, T-cell activation switches, and endothelial cells.16
Notably, Tmunity Therapeutics has recently reported two deaths from neurotoxicity during a clinical trial testing prostate-specific membrane antigen (PSMA)-targeting CAR T cells armored with a dominant negative transforming growth factor beta (TGF-β) receptor in prostate cancer. These events were associated with a unique cytokine profile and massive macrophage activation which did not respond to tocilizumab.24 Similarly, in a clinical trial in patients with melanoma, the administration of tumor-infiltrating lymphocytes armored with an inducible IL-12 gene mediated significant antitumor responses but was accompanied by severe IL-12-related CRS-like toxicity that limited further development of the approach.25 On the one hand, these clinical observations reveal the need for additional mechanistic studies to inform the rational design of therapeutic interventions for solid tumors and, on the other, highlight the complexity that armoring of CAR T cells can add to the toxicity assessment.
Models for CRS and neurotoxicity
Biomarkers are biological characteristics that objectively measure and evaluate biological or pathogenic processes and/or indicators of pharmacological responses to a therapeutic intervention26 and are an essential component of preclinical safety assessment of CAR T cells (figure 2). In particular, the identification of predictive biomarkers may be crucial for the selection of patients at risk of developing severe toxicities who might benefit from early therapeutic intervention. The immunomonitoring of patients treated with CD19 CAR T cells includes serum biomarkers like MCP-1, SGP130, interferon gamma, IL-1, eotaxin, IL-13, IL-10, macrophage inflammatory protein-1 alpha,3 4 27 as well as IL-6, IL-15, and TGF-β28 29 as independent predictors in statistical models assessing risk of CRS and neurotoxicity, respectively.
Several animal models have been employed to predict CRS and ICANS (figure 2), starting with syngeneic mouse strains, comprising intact immune cells and murine CAR T cells. These models have the advantage of recapitulating the complex crosstalk between CAR T cells and host immune cells.30 Allotransplantation studies of murine CAR T cells in mice with different degrees of immune deficiency were the first to suggest the requirement for a functional myeloid compartment to trigger CRS.30 CRS occurrence on infusion of human CAR T cells has not been observed in immunodeficient NSG (Non-Obese Diabetic Severe Combined Immunodeficient (NOD-SCID), gamma) mice but has been reported in SCID-beige mice, which feature a less compromised myeloid compartment. By using the SCID-beige model, it was possible to prove that this reaction is triggered by resident macrophages due to both contact-dependent and cytokine-related mechanisms, such as nitric oxide together with IL-1 and IL-6 release.31 Reconstitution of NSG mice with human hematopoietic stem/precursor cells (HSPCs) offers an alternative approach where human CAR T cells can interact with human myeloid cells and cytokines.
However, the proportion of myeloid cells differentiating from human HSPCs in NSG mice rarely exceeds 5%–10% of human white blood cells.32 Therefore, a triple transgenic NSG mouse strain (SGM3) has been recently proposed to better support the reconstitution of a human hematopoietic system, including the myeloid compartment, due to the expression of human stem cell factor, GM-CSF, and IL-3.33 When HSPC-humanized SGM3 mice were employed, only monocyte–CAR T-cell interactions were found to recapitulate CRS, definitively confirming the primary role of myeloid-derived cells in releasing IL-1 and IL-6, both hallmark cytokines of CRS.10 In contrast to other models, humanized SGM3 mice were also able to recapitulate neurotoxic manifestations, which in this case cannot be ascribed to on-target, off-tumor reactions against mural cells but might be rather connected to CRS-related inflammatory reactions.10 Alternatively, because their immune system is much more similar to that of humans, non-human primates are excellent large animal models for interrogating CAR T-cell toxicities, but they often require autologous CAR T cells and are tumor-free. Nevertheless, given the physiological similarities to humans, these models closely recapitulate CRS and ICANS development.34–36 Finally, biological similarities between canine and human cancer offer the possibility to test engineered T-cell strategies in dogs with naturally occurring tumors. In this regard, the Comparative Oncology Program of the NCI has established a network of 24 veterinary academic partners known as the Comparative Oncology Trials Consortium, which will support the implementation of cell-based trials in dogs for decision-making prior to clinical testing in humans. In return, information from human clinical trials can guide the development of cell-based therapies in veterinary oncology, under the so-called One Health initiative.37
In vitro models
In vitro coculture models that consist of monolayers of cells expressing the target antigen have been traditionally employed to test the specificity and efficacy of CAR T cells38 (figure 2). However, these models were not considered appropriate to predict adverse effects. More recently, other cells such as macrophages have also been included in cocultures of target cells and CAR T cells.11 39 40 Such models have facilitated mechanistic insights of CRS. Importantly, the measurement of biomarkers contained within supernatants from these cocultures can also inform about potential adverse events triggered by CAR T cells in vivo. In fact, recent data show that high levels of catecholamines found in cultures of human CD19 CAR T cells admixed with malignant B cells and macrophages correlated well with CRS seen in mice after CAR T-cell infusion. Accordingly, in patients with diffuse large B-cell lymphoma treated with CD19 CAR T cells, an association was observed between high levels of norepinephrine and severe CRS.40 Moreover, the rapid and massive death of target cells by pyroptosis, which is specifically triggered by CAR T cells, was found to activate macrophages to produce CRS-related cytokines.11
On-target and off-target, off-tumor toxicity (including cytopenias, B-cell depletion, and immune reconstitution)
Ideally, CAR T cells should selectively target malignant cells. However, target antigens are often expressed on both tumor cells and healthy tissues, raising concerns regarding on-target, off-tumor toxicity.41 The severity of toxic manifestations depends on how accessible, widespread, and vital the target tissue is. Reported events range from manageable lineage depletion, such as B-cell aplasia for CD19 CAR T cells,42 secondary hypogammaglobulinemia for BCMA CAR T cells,12 liver toxicity for CAR T cells targeting carboxyanhydrase-IX,43 to severe and fatal pulmonary toxicity for HER2 CAR T cells, possibly associated with recognition of low levels of ERBB2 on lung epithelial cells.44 With CD19 CAR T cells broadly used both in clinical trials and in the commercial setting for ALL and NHL, long-term B-cell depletion is the most commonly described on-target, off-tumor toxicity (figure 1). During normal B-cell development, CD19 is present from the pre-B-cell stage until the plasma cell stage. Long-term B-cell aplasia has been described in all the pivotal phase II CD19 CAR T-cell trials45–48 and contributes to hypogammaglobulinemia, increases the risk of infection, and may have consequences for the response to vaccinations.49 It is generally managed with intravenous immunoglobulin supplementation in pediatric patients; in adult patients, this is common practice only in patients with recurrent bacterial infections. Lymphopenia, in particular CD4+ T-cell lymphopenia, can persist for >1 year.50
Cytopenias (especially neutropenia) persisting >30 days postinfusion are common off-target side effects (30%–40%), the pathogenesis of which is currently unclear. Factors contributing to prolonged cytopenia (>90 days, occurring in 10%–20% of patients) include low baseline cell counts, prior therapies including prior SCT, impaired hematopoietic reserve, bone marrow infiltration and chronic inflammation reflected by higher baseline ferritin and C reactive protein levels,51 and alterations in levels of the chemokine CXCL12 in the marrow microenvironment correlating with events of late neutropenia, likely associated with B-cell recovery.52 The bone marrow is usually hypocellular.
Alternatively, off-target off-tumor toxicity can occur due to cross-reactive binding to a mimotope, which is a similar but distinct epitope expressed on normal tissues. This cross-reactivity or ‘off-target’ binding to cell surface proteins is difficult to predict in preclinical animal studies and can lead to serious adverse effects in patients. Even though CAR T-cell therapies have yet to demonstrate off-target effects mediated by inappropriate scFv recognition of a non-target antigen, TCR-engineered T-cell therapies have revealed the possibility of TCR promiscuity resulting in the death of a patient.53
Models for on-target and off-target, off-tumor toxicity
Animal models for on-target, off-tumor toxicity
When the expression of the target antigen is similar between human and mouse, but the antihuman antibody does not recognize the murine orthologue, on-target, off-tumor toxicity against healthy tissues expressing the molecule of interest can only be addressed in syngeneic models (figure 2). For example, strategies to overcome B-cell aplasia induced by CD19 CAR T cells have been successfully investigated in syngeneic models.54 Similarly, it has been recently shown that the administration of murine CD19 CAR T cells to mice of different strains, including NSG, can cause BBB leakiness and pericyte depletion, supporting the hypothesis that ICANS development in patients could also be the result of on-target, off-tumor recognition of CD19 on brain mural cells.23 However, a significantly lower degree of CD19 expression was observed in mice compared with humans, highlighting that species-specific differences may limit neurotoxicity evaluation in mouse models. In another scenario, when the target antigen has a similar expression profile in humans and rodents and the antibody recognizes both human and murine orthologues, it is possible to profile on-target, off-tumor toxicity using human CAR T cells in immunodeficient mice. For example, high-affinity human GD2 CAR T cells induced fatal encephalitis in NOD-SCID-Il2rg−/− (NSG) mice, possibly due to low GD2 expression on the cerebellum and basal regions of the brain.55 Similarly, in recent studies, the authors took advantage of the cross-reactivity of B7-H3 monoclonal antibodies with murine B7-H3 to investigate the safety of B7-H3 CAR T cells or antibody–drug conjugates both in immunodeficient and immunocompetent tumor-bearing mice.56 57 Alternatively, on-target, off-tumor reactions can be studied in immunocompetent transgenic mice that possess an intact immune system and stably express a transgene encoding for a human tumor-associated antigen (TAA). 
Transgenic mice are generated by knocking out a murine TAA and knocking in the desired human one alongside its regulatory elements, mimicking the spatiotemporal expression patterns as seen in patients. These mice can further be bred with tumor-prone mice or directly grafted with TAA+ tumors to test new CAR T-cell therapies prior to their clinical application, as in the case of carcinoembryonic antigen (CEA) transgenic mice treated with CEA CAR T cells.58 59 In addition, HSPC-humanized mouse models, as described previously, are extremely useful for studying on-target, off-tumor reactions against the hematopoietic compartment, as in the case of CD123 and CD44v6 target antigens.60 61 Finally, primate or canine models are also suitable for studying on-target, off-tumor events, as species-specific differences in antigen expression between non-human primates and humans are limited.36 37
Assessment of target antigen expression
For T cells that have been engineered to specifically bind to a target antigen, a detailed and careful assessment of the expression pattern of the target antigen in normal cells and tissues has to be completed. Antibody-based immunohistochemistry and transcriptomic analysis have been mostly used for exploring target antigen expression in normal tissues (eg, see Lichtman et al62) (figure 2). Published repositories of mRNA and protein expression (eg, Human Protein Atlas) are also commonly employed to evaluate candidate targets in normal and tumor cells. Recently, proteomic and genomic datasets have been generated and integrated with bioinformatics tools to search for optimal CAR targets expression in acute myeloid leukemia and not in normal tissues.63
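The expression-based triage described above can be illustrated with a minimal Python sketch. It assumes that expression values for each candidate have already been extracted from a repository into simple dictionaries; the gene names, tissue names, expression values, and the two thresholds (`min_tumor`, `max_normal`) are hypothetical placeholders chosen for illustration and do not come from any cited dataset or published pipeline.

```python
from typing import Dict, List, Tuple

def shortlist_targets(
    tumor_expr: Dict[str, float],
    normal_expr: Dict[str, Dict[str, float]],
    min_tumor: float = 10.0,   # hypothetical cutoff for "expressed in tumor"
    max_normal: float = 1.0,   # hypothetical cutoff for "absent in normal tissue"
) -> List[Tuple[str, float, str, float]]:
    """Keep candidates that are high in tumor and low in every normal tissue."""
    shortlist = []
    for gene, tumor_level in tumor_expr.items():
        tissues = normal_expr.get(gene)
        if not tissues:
            # Without normal-tissue data a candidate cannot be cleared, so skip it.
            continue
        # The normal tissue with the highest expression drives the safety call.
        worst_tissue, worst_level = max(tissues.items(), key=lambda kv: kv[1])
        if tumor_level >= min_tumor and worst_level <= max_normal:
            shortlist.append((gene, tumor_level, worst_tissue, worst_level))
    # Rank the surviving candidates by tumor expression, highest first.
    return sorted(shortlist, key=lambda row: row[1], reverse=True)

# Hypothetical toy values (arbitrary units), for illustration only.
tumor = {"GENE_A": 45.0, "GENE_B": 60.0, "GENE_C": 5.0, "GENE_D": 30.0}
normal = {
    "GENE_A": {"lung": 0.2, "heart": 0.1},
    "GENE_B": {"lung": 8.0, "liver": 0.3},
    "GENE_C": {"lung": 0.0},
}
print(shortlist_targets(tumor, normal))
```

In this toy example only GENE_A is retained: GENE_B is excluded by its lung expression, GENE_C by low tumor expression, and GENE_D because no normal-tissue data are available. A filter of this kind can only prioritize candidates; transcript levels do not replace protein-level confirmation by immunohistochemistry.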
Models for off-target toxicity
To ensure patient safety, it is imperative to minimize the risk of initiating an inappropriate immune response due to unanticipated off-target binding of CAR T cells to cell surface proteins expressed on normal tissues. Recently developed cell microarray technologies provide an understanding of the off-target profile of CAR T cells and demonstrate on-target specificity.64 The Retrogenix Cell Microarray Technology identifies interactions with both cell surface receptors and secreted proteins by screening scFvs or whole CAR T cells for binding against >4000 full-length proteins that are individually overexpressed in their native context in human cells. This platform, established in human cells, coupled with broad protein coverage, allows even low-affinity interactions to be detected with a high degree of sensitivity and specificity and can provide insights into potential off-target toxicities or soluble sinks for the therapeutic. Furthermore, if off-targets are identified, cell lines or primary cell types endogenously expressing the off-target protein can be used as target cells to assess potential off-target cytotoxicity and CAR T-cell activation. This platform is increasingly being used for CAR T-cell development,65 and data from these studies have been included in regulatory submissions, including the biologics license application for Novartis’s Kymriah.66
Graft-versus-host disease (GVHD) and rejection associated with allogeneic engineered T cells
The use of allogeneic CAR T-cell products generated using cells from healthy donors has the potential to overcome many limitations associated with autologous products but comes with its own challenges, including the potential to induce GVHD (figure 1), as well as the risk of immune-mediated rejection by the host. The risk of GVHD, which correlates with increasing donor–recipient human leukocyte antigen (HLA) disparity, could be mitigated through several approaches, including donor selection, cell-type selection, T-cell depletion/selection and/or use of gene editing. Indeed, gene editing of the endogenous αβ TCR typically includes the disruption of the T-cell receptor alpha constant (TRAC) or T-cell receptor beta constant (TRBC) locus and reduces the risk of GVHD linked to the TCR recognition of allogeneic host tissue.67 Moreover, the use of gene-edited T cells deficient in expression of CD52 has also been explored in combination with alemtuzumab to maintain a prolonged conditioning regimen without affecting CAR T product persistence.68 69 Although this approach could theoretically control GVHD, some concerns have been raised about prolonged lymphodepletion regimens, during which viral reactivations can be problematic.70 Other concerns relate to the fate of a T cell in the absence of its own TCR.71 This is evident in the UCART19 approach, where preliminary clinical data72 show short UCART19 persistence. Notably, although TCR disruption has been developed, it should be underlined that clinical experience to date with allogeneic CAR T cells, either virus-specific or derived from allogeneic transplant donors, has shown antitumor effects with minimal GVHD risk.73 74
Models for GVHD and rejection
MHC-disparate allogeneic mouse models can be employed to study GVHD (figure 2). For example, these models demonstrated that cumulative signaling through the exogenous CAR and the endogenous alloreactive TCR results in a reduced risk of developing GVHD due to loss of function and possible deletion of transferred T cells.75 Similarly, xenoreactions in immunodeficient mice can be exploited as surrogate markers for the potential of human engineered T cells to cause GVHD in patients.76–78 In these models, GVHD scores have been defined and applied based on multiple parameters, such as progressive weight loss, excessive T-cell expansion, ruffled fur, hunchback, and T-cell infiltration of GVHD target organs. Importantly, standard xenograft models cannot be employed to evaluate the rejection potential, which instead can be assessed in allogeneic mouse models. More sophisticated alternatives can be found in the HSPC-humanized SGM3 models, described earlier, where the GVHD potential can be measured as reactivity against human hematopoietic cells developed from allogeneic CD34+ donors. In these models, rejection potential can be evaluated by assessing the time required to develop endogenous human T cells that should be able to mediate rejection of allogeneic engineered T-cell products.10 Alternatively, allogeneic immune responses can be studied in vitro with mixed lymphocyte reactions by coculturing allogeneic CAR T cells with peripheral blood mononuclear cells (PBMCs) from different donors. Furthermore, alloreactivity can also be assessed in vivo in immunodeficient NSG mice by coinfusion of allogeneic CAR T cells with PBMCs from HLA-disparate donors, followed by evaluating engraftment of CAR T cells.79 80
Cross-reactivity (TCR-T cells)
TCR cross-reactivity is a major safety risk of TCR gene therapy (figure 1). TCRs recognize peptides presented by HLA class I and class II molecules. TCR binding involves interactions between the complementarity-determining region (CDR) loops of the TCR and amino acid residues of HLA molecules and the HLA-presented peptides.81 The binding of TCRs can therefore be broken down into a peptide-independent HLA binding component and a peptide-specific component.81 Both components are required to achieve the appropriate binding affinity required for T-cell activation. Thus, cross-reactivity can be caused by two distinct mechanisms: (1) TCRs may cross-react with one of the numerous peptides that are presented by the HLA allele that is used by the TCR to recognize its cognate ligand, and (2) cross-reactivity with distinct HLA alleles presenting a library of peptides to which the TCR is not tolerant may occur. Therapeutic TCRs will only be tolerant to the HLA alleles that were present in the individual from whom the TCR was isolated but not to other HLA alleles that are present in patients treated with TCR gene therapy. In the clinic, rare, although in some cases severe, toxicities have been reported, mainly with artificially enhanced TCRs to date. In particular, TCRs can be modified by different methods, including affinity maturation of their CDRs in order to increase their affinity for the target antigen. Although effective, this approach bypasses the negative selection exerted by the thymus to delete autoreactive T cells and may increase cross-reactivity and recognition of non-target peptides. Patients treated with an affinity-enhanced TCR specific for a MAGE-A3 epitope presented by HLA-A*01 suffered fatal off-target off-tumor cardiac toxicity due to cross-reactivity to titin,53 82 an event completely unpredicted by preclinical studies at the time.
Although all TCRs have the potential for cross-reactivity, this risk is significantly heightened in the context of affinity matured TCRs and has not been observed with T cells engineered to express other therapeutic TCRs within the normal affinity range.
Models for cross-reactivity
The two types of cross-reactivity as described previously need to be assessed using different strategies. HLA alleles that were not present in the TCR donor can stimulate strong T-cell responses. In fact, allogeneic HLA molecules are among the most immunogenic antigens, with 1%–10% of ‘naïve’ T cells responding to peptide-presenting allogeneic HLA molecules. It is therefore critical to exclude alloreactivity of therapeutic TCRs. This can be achieved by extensive in vitro screening with panels of cell lines expressing diverse HLA alleles, and testing of TCR-engineered T lymphocytes against cells from the patient before treatment (figure 2). The potential for cross-reactivity of alternative peptides presented by the HLA allele presenting the ‘cognate’ peptide requires careful analysis. In order to assess this risk, it is useful to define the fine specificity profile of the therapeutic TCR by changing individual residues in the cognate peptide and to measure which changes result in loss of T-cell activation. This analysis reveals peptide residues that are essential for TCR binding and T-cell activation and also residues that can be substituted without loss of TCR binding. The essential residues form a peptide motif that can be used to screen human exome databases and to identify human proteins that contain this motif. The corresponding peptides can be synthesized and tested for recognition by the therapeutic TCR. Peptide titration experiments are important to assess whether TCR recognition occurs only at unphysiological high peptide concentrations or also at more physiological low concentrations. Stimulation of TCR-engineered T cells with cells endogenously expressing the corresponding protein is essential to determine whether ‘natural’ antigen processing and presentation produces the cross-reactive peptide and leads to T-cell activation.
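The motif-based screen described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the cognate peptide, the set of essential positions, and the two protein sequences are hypothetical toy values, and non-essential positions are treated as fully permissive, which is a coarser rule than a real substitution scan would yield. The function names are illustrative and do not correspond to any published tool.

```python
import re
from typing import Dict, List, Set, Tuple

def build_motif(cognate: str, essential: Set[int]) -> "re.Pattern[str]":
    """Fix the residues at essential positions and allow any residue elsewhere."""
    parts = [
        re.escape(aa) if pos in essential else "."
        for pos, aa in enumerate(cognate)
    ]
    # A lookahead wrapper makes finditer report overlapping matches as well.
    return re.compile("(?=(" + "".join(parts) + "))")

def scan_proteins(
    motif: "re.Pattern[str]", proteins: Dict[str, str]
) -> List[Tuple[str, int, str]]:
    """Return (protein name, 0-based start, matching peptide) for every hit."""
    hits = []
    for name, sequence in proteins.items():
        for match in motif.finditer(sequence):
            hits.append((name, match.start(), match.group(1)))
    return hits

# Hypothetical toy inputs, for illustration only.
cognate_peptide = "ALDKIGHTY"
essential_positions = {0, 2, 3, 4, 8}  # assumed outcome of a substitution scan
toy_proteins = {
    "protein_A": "MKTASDKIVAQYLLR",  # contains a motif-matching 9-mer
    "protein_B": "MGGSLLKPEW",       # contains no match
}

motif = build_motif(cognate_peptide, essential_positions)
print(scan_proteins(motif, toy_proteins))
```

Each hit is only a candidate. As the text notes, the corresponding peptides still need to be synthesized, titrated, and tested with cells that process the parent protein endogenously before cross-reactivity can be confirmed or excluded.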
Administration of lymphodepleting regimens, commonly comprising cyclophosphamide and fludarabine, prior to adoptive T-cell transfer is a key step for the clinical success of engineered T-cell therapies.5 72 83–85 To achieve better T-cell engraftment, expansion and persistence, lymphodepletion likely works through multiple mechanisms: (1) decreased immunosuppressive environment,86–88 (2) increased availability of homeostatic cytokines,89 90 and (3) reduced antitransgene immune-mediated rejection.91 However, lymphodepletion also results in several hematological toxicities (neutropenia, anemia, and thrombocytopenia),92 infectious complications,72 93 94 increased incidence and severity of CRS,3 and, in some cases, tumor lysis syndrome95 (figure 1). Other rare adverse events, including hepatotoxicity96 and leukoencephalopathy,97 have also been reported as associated with lymphodepletion in two clinical trials.
Fludarabine and cyclophosphamide-induced lymphodepletion are used in non-human primates98 and immunocompetent mouse models99 100 to study antitumor activity of CAR T cells, but to date, animal models predicting the toxicity of lymphodepleting regimens are still missing.
Insertional mutagenesis and clonal dominance
Retroviral or lentiviral vectors, which are used to engineer T cells, have been linked to rare cases of insertional mutagenesis in humans, as reviewed recently101 (figure 1). The semirandom integration of the vectors in the genome of host cells can lead to the insertion of enhancer or promoter sequences, or to the disruption of genes involved in cellular proliferation or cancer. Differences in integration site selection have been linked to the differential risk of genotoxicity of vector systems.102 Severe adverse events caused by insertional mutagenesis have occurred so far only with the use of gene-modified hematopoietic stem cells (HSCs), whereas no such evidence was found in the long-term follow-up of patients with retroviral gene-modified T cells.103 However, several instances of vector-induced clonal dominance have been recently reported in CAR T-cell protocols. Insertion of a lentiviral vector into the TET2 gene was observed in a T-cell clone whose expansion was associated with chronic lymphocytic leukemia (CLL) tumor eradication.
The integration of the vector in the TET2 gene spliced out its catalytic portion, promoting CAR T-cell proliferation and effective therapy.104 105 Another case of clonal expansion associated with insertion into the E3 ubiquitin-protein ligase CBL (CBL) gene has been reported in CD22 CAR T cells, although it is not clear if CBL gene disruption was directly causing T-cell proliferation.106 Genetic engineering of T cells with the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas system, recently initiated in clinical studies mainly ongoing in China, has provided limited information on safety in humans.107 Notably, a recent study from a phase I clinical trial reports that two patients treated with CD19 CAR T cells engineered by the piggyBac transposon system as a gene transfer tool developed T-cell lymphomas.108 Allogene Therapeutics, a biotechnology company pioneering the development of allogeneic CAR T therapies for cancer (AlloCAR T), has recently reported that, prompted by a chromosomal abnormality found in a bone marrow biopsy taken from a patient following the development of progressive pancytopenia, the Food and Drug Administration (FDA) has placed a hold on the company’s AlloCAR T clinical trials.109 An investigation is currently under way to further characterize the observed abnormality, including any clinical relevance, evidence of clonal expansion, or a potential causative relationship of gene editing using transcription activator-like effector nuclease (TALEN) technology.
Models for genotoxicity including CRISPR and TALEN
Models for assessing genomic integration of a given vector mainly include cell-based assays that were largely derived to model the leukemias observed in HSC-based gene therapy (figure 2). T-cell culture assays were designed to model the recurrent activating vector insertions in the Lmo2 locus noted in early human SCID gene therapy trials. For instance, a murine thymocyte culture assay can reproduce the insertions in Lmo2 as well as other T-cell proto-oncogenes (including Mef2c) that functionally associate with developmental arrest and possible transformation.110 A second assay, termed the in vitro immortalization assay, reports the induction of replating activity in primary murine hematopoietic cells as a result of insertional activation of proto-oncogenes such as Evi1.111 A third assay uses an immortalized murine cell line (BAF3) to measure the frequency of IL-3-independent mutants that arise as a result of vector-induced insertional mutagenesis.112 Probably the most common assay to assess transformation of engineered T cells tests for antigen-independent and IL-2-independent proliferation.
Genetic engineering of therapeutic cell types, including CAR T cells, has entered a new era with the advent of site-specific genomic modifications with the CRISPR/Cas system or with transposases. Although elegant with respect to targeting modifications in a highly precise manner, CRISPR/Cas gene editing is associated with potential side effects stemming from off-target cleavage of the genome, which can be hard to predict. Off-target effects include insertions/deletions (indels) of genomic information at unwanted sites and can lead to chromosomal translocations between on-target and off-target sites or between different off-target sites. There is a battery of in silico, in vitro, and ex vivo methods available for genome-wide assessment of off-target cleavage sites, including the bioinformatics tools MIT CRISPR Design Tool113 and E-CRISP,114 the experimental methods GUIDE-seq,115 Digenome-seq,116 CIRCLE-seq117 and SITE-seq,118 and CAST-seq, which has recently been established to assess chromosomal translocations in edited cells.119 Cumulatively, these methods indicate that (1) the number of off-target sites and the efficiency at which they are modified strongly depend on the individual gRNAs; (2) in silico methods can fail to predict experimentally determined off-target sites (false negatives); and (3) in vitro methods may report false positives and hence lack sufficient specificity.120
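To make the in silico step more concrete, the following minimal Python sketch illustrates the general idea behind mismatch-based off-target nomination: scanning a sequence for sites that resemble the gRNA and are followed by a PAM. It is an illustration only and does not reproduce the scoring of the MIT CRISPR Design Tool or E-CRISP. The function names, the NGG PAM, the forward-strand-only search, and the mismatch cutoff of 3 are all assumptions chosen for brevity.

```python
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    position: int    # 0-based start of the candidate site in the searched sequence
    site: str        # 20-nt (or guide-length) sequence found at that position
    pam: str         # 3-nt sequence immediately downstream of the site
    mismatches: int  # number of positions that differ from the guide


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(sequence: str, guide: str,
                         max_mismatches: int = 3) -> Iterator[Hit]:
    """Yield forward-strand sites followed by an NGG PAM that differ from
    the guide at no more than max_mismatches positions.

    Hypothetical helper: real tools also search the reverse strand, weight
    mismatches by position, and may allow bulges or alternative PAMs.
    """
    sequence = sequence.upper()
    guide = guide.upper()
    n = len(guide)
    # Each window needs n bases for the site plus 3 bases for the PAM.
    for i in range(len(sequence) - n - 2):
        pam = sequence[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue
        site = sequence[i:i + n]
        mm = count_mismatches(site, guide)
        if mm <= max_mismatches:
            yield Hit(i, site, pam, mm)


if __name__ == "__main__":
    # Invented toy sequences, not taken from any real genome or study.
    guide = "GACGTTACCGGATCAAGCTT"
    near_match = "TACGTTACCGCATCAAGCTT"  # differs from the guide at 2 positions
    toy_sequence = "TTT" + guide + "AGG" + "CCCC" + near_match + "TGG" + "AAA"
    for hit in find_candidate_sites(toy_sequence, guide):
        print(hit.position, hit.site, hit.pam, hit.mismatches)
```

A sequence-similarity screen of this kind can only nominate candidate sites. As noted above, such predictions can miss experimentally determined off-target sites, so they complement rather than replace the experimental methods.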
We have highlighted in this chapter the different toxicities encountered in patients infused with engineered T lymphocytes. They range from moderate and manageable to very severe and life-threatening. Although some progress has been made in our understanding of these adverse events, there is still a lot to learn about their underlying mechanisms. Current preclinical models, which mainly consist of in vitro coculture systems and standard xenograft models in immunocompromised mice, have mostly failed to predict these toxicities to date. However, we have also presented new tools and models that can better recapitulate toxicities observed in patients.
Identification of gaps in current preclinical models
Gaps in models for CRS and ICANS
Traditional preclinical modeling of toxicities usually assesses predefined subsets of biomarkers such as proinflammatory cytokines for predicting CRS and ICANS.121 However, recent ex vivo analyses revealed a much more complex picture. For example, single-cell RNA sequencing of axicabtagene ciloleucel CD19 CAR T-cell samples identified a rare monocyte-like cell population that was significantly over-represented in the infusion products of patients who developed high-grade ICANS.122 Thus, new generations of preclinical safety models must incorporate unbiased ex vivo assessment using multiparametric measurements (longitudinal, single-cell, or spatial resolution, if needed), in order to uncover detailed mechanistic insights into how engineered T cells eventually lead to severe toxicities (table 1). Following the conceptual framework of the imSAVAR consortium, a public–private partnership (https://imsavar.eu/), immune-related adverse outcome pathways support scientists of diverse expertise in describing the current knowledge of complex processes leading to toxicities at the molecular, cellular, and organ/organism levels, thus facilitating systematic biomarker development and guidance of preclinical safety model development.
To date, different animal models are required to predict CRS and ICANS because of the need to mirror complex immune interactions at both cellular and molecular levels. Indeed, the human immune system and tumor microenvironment (TME) are not completely recapitulated in any single animal model, so the model must be selected carefully, depending on the primary objectives of the study. Syngeneic immunocompetent mouse models are biased by the murine nature of engineered T cells and CAR constructs, which prevents firm conclusions from being drawn on the corresponding human CAR T-cell products. For example, the intravenous injection of a humanized superagonistic CD28-specific antibody in humans induced a strong CRS that was not predicted in syngeneic mouse models.123 Subsequently, the different CD28 signaling properties between humans and mice have been associated with a single amino acid variant in the C-terminal proline-rich motif that regulates nuclear factor kappa B (NF-κB) activation and proinflammatory cytokine gene expression.124 Possibly related to this, syngeneic models have been largely unable to mimic CRS induced by CAR T-cell products.
Furthermore, adverse events may vary between syngeneic mouse models, as seen when targeting CD19 and NKG2D, so extreme caution is required if considering using only a single mouse strain before moving into the clinic.23 125 126 On the other hand, while standard xenograft models in highly immunodeficient mice (eg, NSG) suffer from the lack and/or dysfunctionality of crucial hematopoietic components such as myeloid cells and cytokines, the use of less immunocompromised mice (SCID-beige) still requires overcoming species-specific barriers that prevent the physiological development of CAR T cell-related toxicities, so that very high CAR T-cell doses and tumor burdens are required.31 While HSPC-humanized SGM3 mice might overcome these limitations, they still suffer from high complexity, heterogeneity, prohibitive costs, and the long time frame required to achieve human reconstitution. In these mice, the employment of xenotolerant T cells generated in a first round of humanization can abate the occurrence of GVHD, allowing for long-term monitoring and eliminating sources of confusion in the interpretation of toxicity.10 However, this approach has limited translational potential due to the impossibility of testing the final patient-derived engineered T-cell products.10 Moreover, limited data are available regarding fully humanized mouse models, with variation in the results probably ascribable to the different sources of CD34+ cells.
While different models proved useful to study CRS, the advance in animal modeling required to fully recapitulate ICANS development appears more difficult, especially in light of recent reports showing that this reaction may have a complex origin, including both an on-target, off-tumor and a genuine inflammatory component.23 Lastly, studies in primates are highly demanding in terms of costs and timing and are further biased by the absence of tumor cells,34 36 limiting the proper evaluation of CAR T-cell antitumor potential and related side effects. Moreover, while autologous primate immune cells can be transduced, their characteristics and performance might be different compared with human T cells. Finally, small group sizes are often required when dealing with large animal models for both ethical and economic reasons.
In vitro models
Two-dimensional (2D) coculture models present several limitations as they do not model well the complexity of the TME. However, additional layers of complexity can be introduced in these models to better mimic the conditions that CAR T cells encounter in the TME and that lead to CRS and ICANS adverse events (table 1). For instance, inclusion of endothelial cells with the tumor cells in CAR T-cell cocultures would recreate cellular interactions prone to producing the inflammatory cytokines measured in the serum of CAR T cell-treated patients. Three-dimensional (3D) models such as organoids are now widely used to culture cancer biopsies. They possess a number of advantages as they closely resemble many aspects of the patient’s tumor.127 However, organoids usually only contain tumor cells and not immune components. Over the past years, efforts have been made to include immune cells, offering the possibility to recreate a favorable environment to test engineered T-cell efficacy,128 but hopefully also to predict CRS and ICANS.
Gaps in models for on-target and off-target, off-tumor toxicities
The use of syngeneic mouse models can be extremely useful when the expression profile of the target antigen is similar between humans and mice. However, human biology is not always recapitulated by mouse biology, thus limiting insights regarding human CAR T-cell behavior (see also animal models for CRS and ICANS).124 Moreover, off-target toxicity is difficult to assess in mouse xenograft models due to differences in off-target expression profiles and biology, and to low protein homology between mice and humans. Hence, in mouse models, human CAR T-cell products generally cannot interact with endogenous mouse tissue in a way that allows toxicity to be observed. Despite their unique utility in studying on-target, off-tumor toxicities, the generation of transgenic immunocompetent mice for multiple human TAAs is a laborious task (table 1). These mice have the great advantage of including a functional immune system and TME. However, apart from the antihuman single-chain fragment variable (scFv), the rest of the CAR construct and the T cells must be of murine origin, hence making it impossible to test the therapeutic potential of clinically relevant fully human T-cell products.
Assessment of target antigen expression
Studies on TCR and CAR targets rarely take into account their body-wide expression, thus underestimating the toxicity risks across normal organs. Efforts like the Human Cell Atlas, whose goal is to generate a large-scale gene expression database across all human cell types, will enable the systematic identification of specific target antigens.129 Traditional immunostaining techniques lack sensitivity and are limited to a few markers on the same tissue section. Novel multiplex imaging approaches (GeoMx, MACSIMA, and MultiOMICs) offer the possibility to precisely locate the target expression in a large panel of normal tissues (table 1). However, the detection of the target antigen in normal tissues is not sufficient to prove that CAR or TCR recognition will have deleterious effects. Monitoring of engineered T-cell responsiveness against human normal cells (2D, 3D, organoids, and organotypical models) represents a more predictive approach. In that regard, the recently introduced ex vivo human models derived from primary tissue explants and biopsies from healthy donors and incubated with engineered T cells130 have great potential to test on-target, off-tumor toxicity. This idea has been used to assess the toxicity of CAR-NK92 cells against EGFRvIII and FRIZZLED using antigen-positive colon cancer organoids and normal colon organoids.131 EGFRvIII-CAR showed selective cytotoxicity against colon cancer organoids. However, FZD-CAR showed cytotoxic engagement of target organoids regardless of their origin, highlighting potential on-target, off-tumor toxicity concerns.131
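As a rough illustration of a body-wide expression screen, the Python sketch below flags normal tissues in which a candidate antigen is expressed above a chosen fraction of its tumor level. It is a minimal sketch under stated assumptions: the function name, the 10% ratio cutoff, and all expression values are hypothetical, and no real database or its API is queried.

```python
from typing import Dict, List, Tuple


def flag_normal_tissue_expression(expression: Dict[str, float],
                                  tumor_level: float,
                                  ratio_threshold: float = 0.1
                                  ) -> List[Tuple[str, float]]:
    """Return (tissue, level) pairs for normal tissues whose expression of the
    target is at least ratio_threshold * tumor_level, highest level first.

    The threshold is an arbitrary illustrative choice, not a validated
    safety margin.
    """
    cutoff = ratio_threshold * tumor_level
    flagged = [(tissue, level) for tissue, level in expression.items()
               if level >= cutoff]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented expression values (arbitrary units) for a hypothetical target.
    normal_tissues = {"colon": 12.0, "lung": 0.4, "heart": 0.0, "skin": 3.5}
    for tissue, level in flag_normal_tissue_expression(normal_tissues,
                                                       tumor_level=30.0):
        print(tissue, level)
```

A flagged tissue in such a screen would only be a prompt for functional testing, since detection of the target alone does not establish that engineered T-cell recognition is harmful.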
Gaps in models for GVHD and rejection
Currently, several limitations affect the employment of the existing preclinical models to predict GVHD and rejection of infused T cells. While syngeneic and allogeneic mouse models suffer from the impossibility of testing human T-cell products, in standard NSG xenograft models, GVHD manifestations are not directly comparable with those observed in treated patients, in whom more complex cellular and molecular interactions are at play (table 1). Indeed, many factors may limit comparability of xenogeneic GVHD data to clinical outcomes, including the lack of control groups treated with GVHD prevention, the use of irradiation as the only source of conditioning, and the homogenous microbiome in mice housed under pathogen-free conditions.132 Additionally, the mechanism of xenogeneic GVHD does not completely recapitulate the underlying pathogenesis in human GVHD. For example, it is donor APCs rather than host APCs that activate human T cells in the xenogeneic GVHD model, whereas host APCs play a significant role in human GVHD.133 On the other hand, only limited data are currently available from more sophisticated animal models, such as HSPC-humanized SGM3 mice, due to their typical time-consuming and cost-prohibitive features. Here, variability in the level of human reconstitution, together with the extent of human cell development, represents the major barrier in deciphering reactions against or caused by engineered T cells. In addition, the complexity of these settings is also dictated by the specific time frame in performing the experiments, as definite cellular interactions are only feasible in specific time intervals. Therefore, there is complexity both in setting experimental conditions and in the interpretation of the results. Further investigations are still required to increase the predictive value of these models.
Gaps in models for TCR T-cell toxicities
There are no robust in vivo models to assess the toxicity of TCRs driven by alloreactivity or peptide-specific cross-reactivity. Immunodeficient xenogeneic murine models are often used to obtain evidence of efficacy of TCR-engineered human T cells, but toxicity assessments are limited by the lack of HLA molecules in the host and by the poor persistence of human T cells in mice. Although the use of HLA transgenic mice can overcome some of these limitations, differences in the HLA-presented peptides in transgenic mice and humans remain a substantial limitation in assessing TCR toxicity. Hence, the in vitro platforms described earlier provide the most valuable TCR safety data prior to the progression to clinical trials.
Gaps in models for genotoxicity including CRISPR and TALEN
Both the FDA and the European Medicines Agency (EMA) recommend that the genotoxic impact of genetic engineering be evaluated in gene therapy products before a clinical application can be approved. These evaluations include investigation of abnormal cell behavior following gene modification, identification and characterization of both intended and unintended genomic alterations, and assays of toxicity mechanisms. In the case of genetically modified T cells, none of the functional assays described earlier provide clinically relevant safety measures. Cell-based models are limited by species (generally applicable only to murine cells), cell type (applicable mostly to HSPCs), the genes whose deregulation is predominantly scored (Evi1 or Lmo2), and the need for an easily scorable phenotype (growth factor-independent culture conditions). Animal disease models are by definition heterologous and more labor-intensive, and mainly suffer from being dependent on cancer-predisposing genotypes; thus, they may not reflect the tumorigenic potential of genetic engineering in primary human cells. Currently, high-resolution vector insertion site studies are a direct way to assess potential clonal dominance, but appropriate cell-based tools for the comprehensive and functional assessment of phenotypical consequences of genetic engineering in therapeutically relevant human T-cell types are lacking.
To conclude this chapter, the models currently used to predict adverse events associated with engineered T cells have the limitations discussed above. One may simply state that in vitro coculture models are human but not systemic, whereas animal models are systemic but not human. We have highlighted the need to develop alternative approaches (eg, 3D ex vivo human models) to more effectively predict adverse events associated with engineered T cells and to understand the pathophysiological processes of these toxicities.
For granting a clinical trial approval or subsequently obtaining marketing approval, the non-clinical safety testing of engineered T cells requires a tailored toxicity program that considers both the complex nature of these medicinal products and the limitations of currently available animal models. Some of the toxicities indicated previously, including CRS and neurotoxicity, are commonly observed in patients treated with engineered T cells. Moreover, the severities of these toxicities are largely dependent on patient-specific factors such as tumor burden, and thus are difficult to mimic in animal models. Consequently, omission of investigating these potential toxicities in non-clinical studies is widely accepted by regulators. Instead, appropriate risk mitigation strategies including a close monitoring of the treated patients are mandated. Clinical trials that include innovative designs should consider phase I studies characterized by a de-escalating/escalating approach,134 split doses,135 and tumor burden-adjusted doses. Other toxicities, mainly those related to the antigen specificity of engineered T cells (eg, on-target, off-tumor toxicity, cross-reactivity, alloreactivity and potential mispairing of TCR T cells) need to be addressed in non-clinical studies. Pivotal safety studies are expected to be performed in compliance with good laboratory practice, unless the complexity of the model used precludes this.136 For T cells that have been engineered to specifically bind to a target antigen, a detailed analysis of the expression pattern of the target antigen in human cells, tissues, and organs has to be completed using gene expression databases and in vitro analyses. 
The recently updated EMA guideline on genetically modified cells provides valuable information and dedicates a specific chapter to the non-clinical development of genetically modified immune cells.137 Of note, both the FDA and the EMA support the 3R principles to reduce, refine, and replace animal use in preclinical development and testing.138 This encouragement is aligned with ethical and animal welfare considerations that demand that animal use during preclinical testing be limited, and preferably avoided, as much as possible. Regulatory acceptance of 3R methods is based on, among other considerations, the availability of defined test methodology including standard protocols with clearly defined and scientifically sound endpoints, as well as on the reliability and robustness of the tests.
Conclusion and future perspectives
Toxicity is a crucial aspect of drug testing and development. Cell therapies relying on engineered T cells are no exception, and very severe adverse events have been observed after infusion in patients. Moreover, in contrast to classic drugs, engineered T cells expand and persist long term in the patient’s body, engaging other immune cells and thereby presenting unique toxicity profiles. As stated in our review, acute toxicities of CAR T cells were first observed clinically and not in preclinical models, which at the time were mostly designed for efficacy testing and mainly consisted of in vitro coculture systems and in vivo models established in immunocompromised mice. Clearly, these models have limits.
There is an urgent need to develop preclinical assays that more effectively predict adverse events associated with engineered T cells but also to understand the pathophysiological processes of these toxicities. Over the past few years, efforts have been made to generate HSPC-humanized mouse models that have proven very useful to predict CRS and neurotoxicity observed in patients after CAR T-cell infusions. Although animals represent multisystem organisms, humanized mice still exhibit deficits in the development of a complete human immune system. Moreover, due to their high costs, complexity of implementation, and ethical considerations, it is important to develop alternative approaches. The recent advent of ex vivo human models, especially organoids and organotypical systems, offers a great opportunity as they can emulate human biology and in principle predict, in a personalized manner, some of the toxicities elicited by engineered T cells. Of course, there are fundamental challenges when working in vitro: one is the quantitative translation from an in vitro to an in vivo effect, and the other is the need to include in the models all cells and components (eg, myeloid cells and endothelial cells) responsible for adverse events induced by engineered T cells. Armoring of engineered T cells (eg, with a dominant negative TGF-β receptor139) further increases complexity and creates additional need for better translatable models.
In conclusion, predicting engineered T-cell toxicities using preclinical models is still in its infancy. Because of their complexity, engineered T-cell safety assessments should not rely on a single model but span a large battery of in silico, in vitro, and in vivo tools. Importantly, next-generation models should be designed as screening tools for both efficacy and toxicity testing of new engineered T cells. There are indications that innovative animal models, such as the humanized SGM3 model, are sensitive enough to detect slight differences between CAR T cells generated from different starting cell sources.140 However, the robustness of such animal models in comparing cell products with similar properties remains to be verified. The validation of existing and novel preclinical models will contribute to the selection of cell products with improved safety and enhanced therapeutic value.
Patient consent for publication
This study does not involve human participants.
This project received funding from the Innovative Medicines Initiative 2 Joint Undertaking (grant agreement number 116026). This Joint Undertaking receives support from the European Union’s Horizon 2020 Research and Innovation program and European Federation of Pharmaceutical Industries and Associations (EFPIA).
MA, BA, SA, CB, BDA, RC, DE, AG, CH, ZI, CK-M, MJK, UK, CK, BL, FL, IM, JM, MAM, EM, HN, CQ, MR, KR, MR, ER, CS, HS, MT and JVdB contributed equally.
Contributors ED and MC conceived, contributed to and revised the manuscript. MH revised the manuscript. ML assisted with figure processing. All the other authors contributed equally and are listed in alphabetical order.
Competing interests ML is an inventor on a patent application related to CAR T-cell therapy filed by Philipps-University Marburg and the University of Würzburg. SA is an inventor of a patent in the field of adoptive T-cell therapy. CB received a research contract from Intellia Therapeutics and participated in the advisory boards of Molmed, Intellia Therapeutics, TxCell, Novartis, GSK, Allogene, and Kiadis, and is an inventor of patents in the field of adoptive T-cell therapy. DE’s PhD is cofunded between the academic lab led by ED as PhD supervisor and the industrial partner Invectys. ER is an inventor of a patent in the field of adoptive T-cell therapy. MT holds a licensed patent related to CAR T cells. MH is an inventor on patents related to CAR T-cell therapy filed by the University of Würzburg. MC is an inventor of patents in the field of adoptive T-cell therapy. CH is an employee of Janssen R&D and shareholder of Johnson & Johnson stock. IM, BL, CH, and HN are full-time employees of Servier. RC, CKa, and JM are employees of Takeda Pharmaceuticals. MJ discloses research support from Kite/Gilead; honoraria for advisory boards, presentations and travel support from Kite/Gilead, Novartis, Celgene/BMS and Miltenyi Biotech (all to institution). All other authors state no potential competing interests.
Provenance and peer review Not commissioned; externally peer reviewed.