Original research
Predicting cardiac adverse events in patients receiving immune checkpoint inhibitors: a machine learning approach
  1. Samuel Peter Heilbroner1,
  2. Reed Few1,
  3. Tomas G Neilan2,
  4. Judith Mueller1,
  5. Jitesh Chalwa1,
  6. Francois Charest1,
  7. Somasekhar Suryadevara1,
  8. Christine Kratt3,
  9. Andres Gomez-Caminero3 and
  10. Brian Dreyfus3
  1. 1Data Science, ConcertAI, New York, New York, USA
  2. 2Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
  3. 3Bristol Myers Squibb, New York, New York, USA
  1. Correspondence to Dr Samuel Peter Heilbroner; sheilbroner{at}


Background Treatment with immune checkpoint inhibitors (ICIs) has been associated with an increased rate of cardiac events. There are limited data on the risk factors that predict cardiac events in patients treated with ICIs. Therefore, we created a machine learning (ML) model to predict cardiac events in this at-risk population.

Methods We leveraged the CancerLinQ database curated by the American Society of Clinical Oncology and applied an XGBoosted decision tree to predict cardiac events in patients taking programmed death receptor-1 (PD-1) or programmed death ligand-1 (PD-L1) therapy. All curated data from patients with non-small cell lung cancer, melanoma, and renal cell carcinoma, and who were prescribed PD-1/PD-L1 therapy between 2013 and 2019, were used for training, feature interpretation, and model performance evaluation. A total of 356 potential risk factors were included in the model, including elements of patient medical history, social history, vital signs, common laboratory tests, oncological history, medication history and PD-1/PD-L1-specific factors like PD-L1 tumor expression.

Results Our study population consisted of 4960 patients treated with PD-1/PD-L1 therapy, of whom 418 had a cardiac event. The following were key predictors of cardiac events: increased age, corticosteroids, laboratory abnormalities and medications suggestive of a history of heart disease, the extremes of weight, a lower baseline or on-treatment percentage of lymphocytes, and a higher percentage of neutrophils. The final model predicted cardiac events with an area under the curve–receiver operating characteristic of 0.65 (95% CI 0.58 to 0.75). Using our model, we divided patients into low-risk and high-risk subgroups. At 100 days, the cumulative incidence of cardiac events was 3.3% in the low-risk group and 6.1% in the high-risk group (p<0.001).

Conclusions ML can be used to predict cardiac events in patients taking PD-1/PD-L1 therapy. Cardiac risk was driven by immunological factors (eg, percentage of lymphocytes), oncological factors (eg, low weight), and a cardiac history.

  • programmed cell death 1 receptor
  • lung neoplasms
  • immunotherapy

Data availability statement

Data may be obtained from a third party and are not publicly available.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See


Introduction

Immune checkpoint inhibitor (ICI) therapy with programmed death receptor-1 (PD-1) or programmed death ligand-1 (PD-L1) inhibitors has dramatically improved outcomes in patients with cancer.1–8 However, ICI therapy has also been associated with several immune-related adverse events (irAEs).9 Cardiac irAEs with ICI therapy are not common but have a mortality rate of up to 30%,9 and the occurrence of cardiac events with ICI therapy is likely under-reported.10–14 Myocarditis is the best-described cardiac irAE with ICI therapy15–19; however, other cardiac irAEs such as pericarditis, pericardial effusions, and acute vascular events are increasingly described.11 12 15 20–22

The risk factors for the development of adverse cardiac events with ICI therapy are poorly understood.14 Beyond combination immune therapy,15 there are no established risk factors for cardiac events with ICI therapy. Identification of patients at increased risk of ICI-related cardiac side effects may support testing of rational surveillance strategies in such high-risk populations. However, efforts to identify a high-risk population have been limited by the small sample sizes of most studies to date and by uncertainty about the optimal approach to such risk stratification. Machine learning (ML) has improved our ability to predict outcomes in oncology.23 Furthermore, ML techniques better account for the complex interactions between risk factors, which have gone relatively unevaluated so far. Using Shapley additive explanations (SHAP), ML can be used to discover novel insights into the disease process.24–27 Therefore, we created an ML decision tree model for predicting cardiac events among patients being treated with a PD-1 or PD-L1 inhibitor in the CancerLinQ database. We also used SHAP to identify novel risk factors for future cardiac events.28


Methods

To predict cardiac events, we trained an XGBoosted decision tree model on all patients with non-small cell lung cancer (NSCLC), melanoma, and renal cell carcinoma (RCC) who had PD-1/PD-L1 therapy.

Data source

Our model was trained on ConcertAI data. ConcertAI derives its data from several sources, most notably the CancerLinQ database (American Society of Clinical Oncology), which in turn aggregates data from oncological practices across the USA. ConcertAI receives copies of each oncological practice’s complete electronic medical record (EMR), including unstructured notes and scanned reports. Structured data from the EMR such as ICD codes, medications, and laboratory values are extracted programmatically, combined with important hand curated variables from provider notes, and entered into a standardised database which can be easily analysed and queried. Data are provided as far back as the practices have history.

This real-world dataset provides information on patient demographics, tumor characteristics, comorbidities, treatment information, treatment toxicities, and outcomes. A subset of patients with at least 10 documents in their chart were selected for deep curation, a process in which a team of oncology nurses read a patient’s notes and abstracted additional information, including treatment toxicities and tumor response/progression information. Our final analysis was conducted only on patients who underwent curation.

Cohort selection and labeling

Our cohort included all patients with NSCLC, melanoma, and RCC who (1) underwent nurse curation of treatment toxicities, (2) did not have evidence of a second malignancy, (3) had PD-1/PD-L1 therapy regardless of stage, and (4) were at least 18 years old at index. Index was defined as the date of first PD-1/PD-L1 administration. If a patient received multiple distinct PD-1/PD-L1 drugs, the last PD-1/PD-L1 drug taken was used for indexing. This was done to maximize the number of patients with combination immunotherapy available for analysis.

A cardiac event was defined as the first documentation of arrhythmia (eg, complete heart block and ventricular fibrillation), heart failure, myocarditis, or pericardial disease as defined by the Medical Dictionary for Regulatory Activities (MedDRA) and International Classification of Diseases (ICD) codes (table 1). Codes from both the native EMRs and nurse curation were included to define endpoints. Mild or non-specific cardiac conditions, such as chest pain, were not included because they are difficult to attribute to a new onset cardiac event and may represent documentation of alternate etiologies. Some providers routinely document chronic conditions in a patient’s chart. To ensure that all our events represented new cardiac disease, we did not include an event if the patient had a history of similar events prior to the index date. For example, if a patient had a history of heart failure prior to starting immunotherapy, heart failure ICD codes were ignored when defining cardiac events for this patient. ICD and MedDRA codes were defined as similar if they fell into the same category (table 1). If a patient had any ICD or MedDRA code for one of the conditions listed in table 1 in the 6 months prior to the index date, then they became ineligible for a cardiac event in the same category.
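The labeling rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the code-to-category mapping is a hypothetical stand-in for table 1, "6 months" is approximated as 183 days, and an entry dated on the index date is assumed to count as post-index.

```python
from datetime import date, timedelta

# Hypothetical mapping from diagnosis code to event category. The real
# groupings are those of table 1, which is not reproduced here.
CODE_TO_CATEGORY = {
    "I50.9": "heart_failure",
    "I48.91": "arrhythmia",
    "I40.9": "myocarditis",
    "I31.3": "pericardial_disease",
}

LOOKBACK = timedelta(days=183)  # assumption: "6 months" taken as 183 days


def first_new_cardiac_event(index_date, coded_entries):
    """Return (date, category) of the first qualifying event, or None.

    coded_entries is an iterable of (entry_date, code) pairs. Codes that are
    not in CODE_TO_CATEGORY (eg, chest pain) are ignored.
    """
    entries = [
        (d, CODE_TO_CATEGORY[c]) for d, c in coded_entries if c in CODE_TO_CATEGORY
    ]
    # A category documented in the lookback window before index is ineligible.
    prior = {cat for d, cat in entries if index_date - LOOKBACK <= d < index_date}
    candidates = [
        (d, cat) for d, cat in entries if d >= index_date and cat not in prior
    ]
    return min(candidates) if candidates else None
```

Under this sketch, a patient with a heart failure code 2 months before index can still be labeled with a new arrhythmia, but not with a new heart failure event.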

Table 1

To avoid missing cardiac events, patients were censored after their last date of continuous follow-up. Continuous follow-up was defined as at least two EMR entries within a 6-month period. To better establish a causal relationship between PD-1/PD-L1 therapy and cardiac events, patients were also censored 100 days after their last PD-1/PD-L1 administration.
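One possible reading of the two censoring rules is sketched below. The interpretation of "continuous follow-up" as consecutive EMR entries no more than 6 months apart, the 183-day approximation, and all function names are assumptions made for illustration.

```python
from datetime import date, timedelta

GAP = timedelta(days=183)        # assumption: "6-month period" taken as 183 days
POST_DOSE = timedelta(days=100)  # censor 100 days after the last administration


def last_continuous_followup(index_date, entry_dates):
    """Walk forward from index and stop at the first gap longer than GAP."""
    last = index_date
    for d in sorted(d for d in entry_dates if d >= index_date):
        if d - last > GAP:
            break
        last = d
    return last


def censor_date(index_date, entry_dates, dose_dates):
    """Earlier of the end of continuous follow-up and last dose + 100 days."""
    followup_end = last_continuous_followup(index_date, entry_dates)
    return min(followup_end, max(dose_dates) + POST_DOSE)
```

A cardiac event would then count only if it occurs on or before the returned date.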

Time windowing

To avoid the effects of immortal time bias29 and reverse causality, most patient variables were calculated using data from 6 months before index to index. For vitals, laboratory results, and Eastern Cooperative Oncology Group (ECOG) performance status, some patients would have had missing values had we not included data from immediately after the index date. To minimize bias secondary to missing data, we allowed these values to be calculated using data from up to 30 days after index. If a patient suffered a cardiac event within 30 days of index, the data were truncated 15 days before the cardiac event. To better distinguish between baseline and on-treatment laboratory values, we conducted a sensitivity analysis using only data that were strictly before the index date. The results are reported in the appendix (online supplemental figure 1) and did not change significantly from our primary analysis.
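The windowing logic can be expressed as a small helper that returns the date range from which a variable may be computed. This is a sketch under stated assumptions: 6 months is approximated as 183 days, the truncation is applied to every variable when an event occurs within 30 days of index, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta

BASELINE = timedelta(days=183)  # assumption: 6 months taken as 183 days
GRACE = timedelta(days=30)      # post-index allowance for vitals, labs, ECOG
BUFFER = timedelta(days=15)     # truncation applied before an early event


def feature_window(index_date, event_date=None, allow_post_index=False):
    """Return the inclusive (start, end) window for one variable."""
    start = index_date - BASELINE
    end = index_date + GRACE if allow_post_index else index_date
    # An event within 30 days of index pulls the window end back to
    # 15 days before that event.
    if event_date is not None and event_date <= index_date + GRACE:
        end = min(end, event_date - BUFFER)
    return start, end
```

For the sensitivity analysis restricted to pre-index data, the same helper would be called with `allow_post_index=False` for every variable.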

Model variables

A total of 356 variables were abstracted from the data for modeling. Demographics, such as age at diagnosis, sex, race, and ethnicity, were extracted directly from the electronic medical record. Cancer type was abstracted through nurse curation and documented using ICD codes. A patient’s pseudostage was defined as stage 4 if the patient had progressed to metastatic disease by index date and their stage at diagnosis otherwise. Because patients with NSCLC represented such a large percentage of our cohort, we also included NSCLC histology as a covariate by mapping ICD for Oncology, Third Edition codes in the data to categories defined in the Surveillance, Epidemiology, and End Results Database guidelines.30 When a patient had multiple records which disagreed (eg, multiple different histology codes), the record closest to the index was used for analysis.
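Two of the rules in this paragraph, the pseudostage definition and the closest-record rule for discordant entries, are simple enough to show directly. The function names and the record layout are illustrative assumptions, and ties between equally distant records are broken by list order here.

```python
from datetime import date


def pseudostage(stage_at_diagnosis, metastasis_date, index_date):
    """Stage 4 if metastatic disease was documented by index, else stage at diagnosis."""
    if metastasis_date is not None and metastasis_date <= index_date:
        return 4
    return stage_at_diagnosis


def closest_record(records, index_date):
    """Pick the value whose record date is nearest to the index date.

    records is a non-empty list of (record_date, value) pairs.
    """
    return min(records, key=lambda r: abs((r[0] - index_date).days))[1]
```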

Comorbidities were calculated using ICD codes documented in the EMR and by nurse curators. The Charlson Comorbidity Index was calculated using the method described by Deyo.31 Common comorbidities, such as congestive heart failure, were calculated using ICD codes enumerated in the Centers for Medicare and Medicaid Services’ Chronic Conditions Data Warehouse algorithm.32 Autoimmune conditions were calculated using codes listed in online supplemental table 1. ECOG performance status was taken directly from EMR records and curated data. Smoking status was binned into current-smoker, ex-smoker, and never-smoker categories. When a single patient had multiple discordant smoking statuses, the worst was used for analysis (ie, current smoker>ex-smoker>never smoker).
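The "worst status wins" rule for smoking can be written as an ordering over categories. The category labels below are hypothetical shorthand for the three bins named in the text.

```python
# Higher number = worse status (current smoker > ex-smoker > never smoker).
SEVERITY = {"never": 0, "ex": 1, "current": 2}


def worst_smoking_status(statuses):
    """Return the most severe documented status, or None if none is recognised."""
    known = [s for s in statuses if s in SEVERITY]
    return max(known, key=SEVERITY.get) if known else None
```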

Vitals, including heart rate, respiratory rate, blood pressure, O2 saturation, temperature, weight, and body mass index (BMI) were collected using Logical Observation Identifiers Names and Codes (LOINC) codes (online supplemental table 2). Change in weight per day (lb/day) was calculated as the slope of the trend line fitting the patient’s weight in the 6 months prior to diagnosis. A negative slope indicated that the patient was losing weight; a positive slope indicated that they were gaining weight. Common laboratory and panel values, including complete blood count (CBC), basic metabolic panel (BMP), magnesium, phosphorus, liver function tests, lactate dehydrogenase, B-type natriuretic peptide (BNP), troponin, erythrocyte sedimentation rate (ESR), C-reactive protein, prothrombin time, partial thromboplastin time, international normalised ratio, D-dimer, and fibrinogen, were also collected using LOINC codes (online supplemental table 2). These laboratory parameters were recorded either prior to the start of an ICI or within 30 days after the start of an ICI. The absolute neutrophil count and absolute lymphocyte count (ALC) were calculated using the corresponding percent values from the CBC and the white blood cell count. PD-1/PD-L1 expression was available directly in the data.
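The two derived quantities in this paragraph are straightforward to compute. The sketch below assumes the trend line is an ordinary least-squares fit of weight against time in days, which the text does not state explicitly; the function names are illustrative.

```python
from datetime import date


def weight_slope_lb_per_day(measurements):
    """Least-squares slope of weight (lb) against time (days).

    measurements is a list of (date, weight_lb) pairs. Returns None when a
    slope cannot be estimated (fewer than two distinct days).
    """
    if len(measurements) < 2:
        return None
    t0 = min(d for d, _ in measurements)
    xs = [(d - t0).days for d, _ in measurements]
    ys = [w for _, w in measurements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # all measurements taken on the same day
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def absolute_count(wbc, percent):
    """Absolute cell count from the white blood cell count and a CBC percentage."""
    return wbc * percent / 100.0
```

A patient who drops from 200 lb to 196 lb over 20 days has a slope of about -0.2 lb/day, and a white blood cell count of 8.0 with 25% lymphocytes gives an ALC of 2.0 in the same units.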

Using RxNorm codes, all of a patient’s oncological and non-oncological medications were collected. To simplify the model, medications of the same class (eg, platinum-based chemotherapies, angiotensin-converting enzyme inhibitors) were grouped together using the Food and Drug Administration (FDA) Established Pharmacology Class (EPC).33 The EPC classification system already includes categories for PD-1, PD-L1, and cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) therapies. However, we also included the number of distinct immunotherapies a patient has undergone to explicitly model the interaction between these drugs.

Missing data

Because our data were derived from oncology clinics, many of the non-oncological laboratory tests (eg, troponin, BNP) were missing. XGBoost naturally handles missing data. However, because patients with a cardiac event within 30 days of the index date had their data truncated, missing data were erroneously associated with early adverse events. As a sensitivity analysis, we replaced missing values using multivariate imputation by chained equations with the 10 nearest features.34 We also analysed how the model changed when we excluded features with missing rates above 70%, 80%, and 90%.
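The missing-rate sensitivity analysis amounts to filtering features by their fraction of missing values before refitting. A minimal sketch follows, assuming a column-oriented table in which `None` marks a missing value; the imputation step is not shown, and the names are hypothetical.

```python
def missing_rate(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)


def drop_sparse_features(table, threshold):
    """Keep features whose missing rate does not exceed the threshold.

    table maps feature name -> list of per-patient values.
    """
    return {
        name: col for name, col in table.items() if missing_rate(col) <= threshold
    }
```

Calling `drop_sparse_features(table, t)` for `t` in `(0.7, 0.8, 0.9)` would yield the three reduced feature sets described above.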


We used an XGBoosted decision tree to predict time to cardiac events. To properly account for censoring, we modified the model’s loss function to the same one used in the Cox proportional hazard (CoxPH) model.35 36 For each patient, the model predicted an HR. As in a CoxPH model, a higher HR indicated greater risk. The tree depth and learning rate were tuned to 2.0 and 0.15, respectively, based on a stochastic search with cross validation (n=10).
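For reference, the CoxPH loss is usually written as the negative log partial likelihood. The form below is the standard one without a correction for tied event times; whether the model described here used exactly this variant is an assumption. The notation is introduced for illustration: $t_i$ is the follow-up time of patient $i$, $\delta_i$ is 1 if a cardiac event was observed and 0 if the patient was censored, and $f(x_i)$ is the summed output of the boosted trees.

```latex
\ell(f) \;=\; -\sum_{i:\,\delta_i = 1}
  \left[ f(x_i) \;-\; \log \sum_{j:\, t_j \ge t_i} \exp\bigl(f(x_j)\bigr) \right],
\qquad
\widehat{\mathrm{HR}}_i \;=\; \exp\bigl(f(x_i)\bigr)
```

Censored patients contribute no term of their own to the outer sum, but they remain in the inner risk-set sum for as long as they are under follow-up, which is how the loss accounts for censoring.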

Validation and performance

The data were randomly divided into training (80%) and testing (20%) sets. The model’s ability to predict cardiac events within 20, 40, 60, 80, 100, 120, and 140 days of index was assessed using area under the curve–receiver operating characteristics (AUC-ROCs).37 We also calculated the concordance index, which represents the percentage of time a model, given a random pair of patients, correctly predicts which patient will have a cardiac event first. Finally, we divided the PD-1/PD-L1 test set into two equal-sized high-risk and low-risk subcohorts. The high-risk group had predicted HRs above the median, and patients in the low-risk group had HRs below the median. The time elapsed prior to an adverse cardiac event for these two subcohorts was compared using the cumulative incidence function and the Aalen-Johansen estimator.
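The concordance index and the median split can both be computed from the predicted HRs alone. The sketch below uses the common Harrell-style definition, in which a pair is comparable when the patient with the shorter time had an observed event; tied predictions score one half, and patients exactly at the median fall into the low-risk group. These conventions are assumptions, since the text does not specify them.

```python
from statistics import median


def concordance_index(times, events, risks):
    """Share of comparable patient pairs that the model ranks correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each pair is anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def split_by_median(risks):
    """Label each patient high-risk (above the median predicted HR) or low-risk."""
    cut = median(risks)
    return ["high" if r > cut else "low" for r in risks]
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.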

Model interpretation and statistical analysis

We used SHAP to determine the variables most predictive of cardiac risk.38 Bootstrapping (n=500) was used to calculate p values for each feature under the null hypothesis that the mean absolute SHAP score for that feature was zero.
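The text does not spell out how the bootstrap p values were derived. One plausible reading, offered here purely as an assumption, is that the model was refit on each bootstrap resample and the p value for a feature is the fraction of resamples in which its mean absolute SHAP score was zero. This reading is at least consistent with the reported p values, which are all multiples of 1/500 and bottom out at p<0.002.

```python
def bootstrap_p_value(replicate_scores):
    """Fraction of bootstrap refits in which the feature's mean |SHAP| was zero.

    replicate_scores holds one mean absolute SHAP score per bootstrap refit.
    """
    n = len(replicate_scores)
    zeros = sum(1 for s in replicate_scores if s == 0.0)
    return zeros / n


def format_p(p, n):
    """Report p=0 as 'p<1/n', the smallest value resolvable with n resamples."""
    floor = 1.0 / n
    return f"p<{floor:g}" if p == 0 else f"p={p:g}"
```

With 500 resamples, a feature that is used in every refit is reported as p<0.002, and one that drops out of three refits as p=0.006.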


Results

There were 4960 patients in our cohort. Of these patients, 4103 were treated for NSCLC; 534 were treated for melanoma; and 323 were treated for RCC. The median follow-up time was 154 (IQR 57–342) days. A total of 418 patients experienced a cardiac event before they were censored. The most common cardiac events were atrial fibrillation (34%), heart failure (23%), and pericardial disease (20%). The rates of all cardiac events are listed in table 1. Model training was performed in 80% of patients (3937 PD-1/PD-L1 patients). The remaining patients were used for model validation. We also conducted several sensitivity analyses in which we included acute coronary syndrome (ACS)-related events or removed atrial fibrillation from the list of events. The results were unchanged.

Demographics are shown in table 2, stratified by whether the patient had a cardiac event within 100 days or not. Patients censored before 100 days are not included in table 2 because one cannot determine if they would have had a cardiac event after they were censored. However, censored patients were not excluded from the modeling because the CoxPH loss function can adjust for this source of bias. Compared with those without an event, patients with a cardiac event were more likely to be white (75% vs 68%) or African–American (15% vs 14%), male (61% vs 53%), have lung cancer (88% vs 80%), be at a higher stage, be a current smoker or ex-smoker, have chronic obstructive pulmonary disease (41% vs 31%), have a higher ECOG status, or have a higher Charlson score. These results were all significant at a p<0.05 level.

Table 2

Demographics of patients in our cohort, stratified by cardiac event

The rates of missing data varied by feature (full results in online supplemental table 3). In general, missing data in demographic variables (such as age) were negligible. Common vital signs and laboratory values (like BMP and CBC) had missing rates between 10% and 60%. Laboratory tests that are less commonly ordered in the oncology clinic (such as ESR, BNP, and troponin) had missing rates as high as 99%. Despite these high missing rates, we opted to include these variables in the model because missingness can be an informative feature. We conducted several sensitivity analyses in which we excluded features with missing rates above 70%, 80%, and 90%. We also tried imputing missing values with multivariate imputation by chained equations. The results of these sensitivity analyses are shown in the appendix (online supplemental table 4 and figures 2–5) and did not substantially differ from our primary analysis.

Using SHAP, we identified several variables that helped predict time to adverse cardiac event. The 40 variables with the highest mean absolute SHAP score are shown in figure 1. Some of the variables associated with an increased risk of cardiac events are as follows: increased age (p<0.002); immunological labs including a lower baseline or peri-index percentage of lymphocytes (p=0.002), a higher percentage of neutrophils (p=0.02), a lower absolute lymphocyte count (p<0.002), a higher absolute neutrophil count (p=0.006), and a higher WBC (p=0.02); administration of immunomodulators including absence of corticosteroids (p=0.008) and receipt of a PD-L1 antibody instead of a PD-1 antibody (p=0.028); a higher platelet count (p<0.002); vital signs such as an abnormal (high or low) heart rate (p<0.002), abnormal temperature (p<0.002), abnormal weight (p=0.008), increase in weight over time (p=0.002), or abnormal BMI (p=0.01); various laboratory abnormalities and medications associated with heart failure including not receiving an ACE inhibitor (p=0.026), not receiving a loop diuretic (p=0.01), lower hemoglobin (p<0.002), lower sodium (p<0.002), and lower chloride (p<0.002); liver abnormalities including a lower aspartate aminotransferase (AST) (p=0.01), a lower alanine aminotransferase (ALT) (p=0.002), and lower alkaline phosphatase (p=0.02); renal abnormalities including a lower creatinine (p<0.002) or higher blood urea nitrogen (BUN) (p=0.002); a missing prothrombin time (p=0.004); and a higher lactate dehydrogenase (LDH) (p=0.004). Note that although PD-1 therapy significantly impacted cardiac risk, it was not a top 40 predictor and cannot be seen in figure 1. A complete list of all significant risk factors is given in online supplemental table 5.

To determine the clinical significance of the vital sign changes detected by the model, we examined the model’s trees by hand. Heart rate had three major cut-offs (~60, ~100, and ~150 beats/min), and temperature had cut-offs at around 36.1°C and 37.8°C.

Figure 1

SHAP summary plot for interpreting the impact of features on our model. Each row shows the impact of a single feature on the model’s predictions. Within each row, each dot represents a patient. Red means patients had a high feature value; blue means patients had a low value; gray means patients had a missing value. The position of the dot along the x-axis indicates whether that feature increased or decreased a patient’s predicted risk. When all the red dots are on the right, a high feature value was associated with increased risk. When all the blue dots are on the right, a low feature value increased risk. Statistical significance is indicated as follows: *p<0.05, **p<0.01, ***p<0.002. BMI, body mass index; Cr, creatinine; DBP, diastolic blood pressure; SBP, systolic blood pressure; Hb, hemoglobin; SHAP, Shapley additive explanations.

To maximize model performance, we included laboratory, vitals, and ECOG values up to 30 days after index. However, this choice also limits our ability to interpret which variables contribute most to cardiac risk. In general, we found that ~75% of these values were baseline and ~25% of values were in the immediate postindex period. As a sensitivity analysis, we repeated the model using only values that were before index. The resulting SHAP plots are shown in online supplemental figure 1. Results did not vary substantially from our primary analysis.

The model’s ability to predict cardiac events within 20, 40, 60, 80, 100, 120, and 140 days of index is shown in figure 2. At 100 days, the AUC-ROC was 0.65 (95% CI 0.58 to 0.75). The model had a concordance index of 0.66 (95% CI 0.57 to 0.71). Using our model, we divided patients into low-risk and high-risk subgroups. The cumulative incidence of cardiac events in these two groups is shown in figure 3. At 100 days, the cumulative incidence of cardiac events was 3.3% in the low-risk group and 6.1% in the high-risk group (p<0.001), a relative increase of roughly 80%.

Figure 2

Plot of the cumulative dynamic AUC-ROC of our model on PD-1/PD-L1 patients.61 The model’s ability to predict cardiac adverse events within 20, 40, 60, 80, 100, 120, and 140 days of index is shown. The model’s AUC-ROC varied between 0.63 and 0.72 as the time window for predictions changed. AUC-ROC, area under the curve–receiver operating characteristic; PD-1, programmed death receptor-1; PD-L1, programmed death ligand-1.

Figure 3

Cumulative incidence of cardiac events in low-risk and high-risk PD-1/PD-L1 patients from the test set. Groups were stratified by our model’s median predicted HR. The cumulative incidence function was calculated using the method described by Aalen and Johansen, taking into account the competing risk of death.62 High-risk patients had a significantly higher incidence of cardiac events. PD-1, programmed death receptor-1; PD-L1, programmed death ligand-1.
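The Aalen-Johansen cumulative incidence for one cause in the presence of a competing risk can be computed with a single pass over the distinct event times. The sketch below is a plain-Python illustration with hypothetical cause codes (0 = censored, 1 = cardiac event, 2 = death without a cardiac event); it is not the implementation used to produce figure 3.

```python
def aalen_johansen_cif(times, causes, horizon):
    """Cumulative incidence of cause 1 up to and including `horizon`.

    times and causes are parallel lists with one entry per patient.
    """
    survival = 1.0  # probability of being free of any event just before t
    cif = 0.0
    event_times = sorted(
        {t for t, c in zip(times, causes) if c != 0 and t <= horizon}
    )
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cardiac = sum(1 for u, c in zip(times, causes) if u == t and c == 1)
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        # Cardiac events at t only count for patients still event-free before t.
        cif += survival * d_cardiac / at_risk
        survival *= 1.0 - d_any / at_risk
    return cif
```

Treating deaths as censoring instead would overstate the cumulative incidence, because patients who have died can no longer experience a cardiac event.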

In the real clinical setting, not all the 356 variables in our model would always be available. Therefore, we tested how our model performed if only the top 20 features from our SHAP plots were used. The AUC-ROC of this simpler model was 0.65 and the concordance index was also 0.65. Complete results are shown in online supplemental table 1.
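Selecting the features for such a reduced model comes down to ranking by mean absolute SHAP score and keeping the first k. A minimal sketch, assuming per-patient SHAP values are already available as plain lists keyed by a hypothetical feature name:

```python
def top_k_features(shap_values, k):
    """Names of the k features with the largest mean absolute SHAP score.

    shap_values maps feature name -> list of per-patient SHAP values.
    """
    importance = {
        name: sum(abs(v) for v in vals) / len(vals)
        for name, vals in shap_values.items()
    }
    return sorted(importance, key=importance.get, reverse=True)[:k]
```

The reduced model would then be refit using only the columns returned by `top_k_features(shap_values, 20)`.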

Our model evaluated several immunotherapy-related factors, including the number of distinct immunotherapies undergone, the type of immunotherapy, and PD-1/PD-L1 tumor expression values. Only receipt of PD-1 versus PD-L1 therapy was significant.


Discussion

We leveraged a large cross-sectional database of patients with cancer and created an ML model that predicts cardiac events in patients taking PD-1/PD-L1 therapy. The model had solid performance (AUC-ROC 0.65 at 100 days from index). We also showed that a simpler model using only the top 20 features had comparable performance. In practice, a provider could use this simpler model to enter values by hand or integrate the more complex model into their EMR, thus avoiding manual data entry. Because our model can handle missing values and was trained on a real-world dataset that replicates the distribution of missing values seen in clinical practice, a provider need not order any additional laboratory or imaging tests to use our model. These values may simply be entered as missing.

The results are complementary and additive to what has previously been reported in the literature. For example, Kartolo et al were able to predict any irAE with an AUC-ROC of 82.4% using clinically driven factors similar to those in our model.39 Our performance was probably lower than that found by Kartolo et al because our model was predicting very specific and rare cardiac events. Using SHAP, we also elucidated several novel risk factors for cardiac adverse events. In general, a combination of immunological factors, oncological factors, and cardiac history was associated with subsequent cardiac events.

We found that patients with a higher percent lymphocyte count, lower percent neutrophil count, or lower platelet count were at lower risk of cardiac disease. The neutrophil to lymphocyte ratio (NLR) is thought to be a marker of the inflammatory tumor microenvironment and therefore could represent a patient’s potential risk of ICI-mediated cardiac toxicity. Traditionally, a low NLR has been associated with both a higher risk of toxicity and a higher response rate.40 We found that patients with a low NLR were at decreased risk of cardiac events. This suggests that in patients with a low NLR, ICI therapies’ ability to treat the tumor and reduce cardiopulmonary burden is protective against cardiac events. Platelets are known to express PD-1 receptors and interact with immunotherapy41 and have traditionally been associated with a better response and higher rates of toxicity.40 This is consistent with our findings.

In our model, corticosteroids were associated with a lower rate of cardiac events. The use of corticosteroids in patients on immunotherapy is well studied.42 PD-1 therapy (vs PD-L1) was also associated with a lower risk of cardiac disease. Traditionally, treatment with PD-1/CTLA-4 combined therapy has been associated with the highest risk of myocarditis.15 However, treatment with PD-1 therapy has also been associated with lower rates of myocarditis.15 Given the borderline significance (p=0.028) of this result, low number of PD-L1 patients, and potential for a type I error in the setting of multiple comparisons, it is important to be cautious when interpreting this result.

We identified abnormal heart rate, both high and low, and abnormal temperature as predictors of cardiac events. Baseline vital signs help identify baseline disease in the patient, even if the patient does not carry a formal diagnosis with an ICD code for that condition. For example, our final model identified patients with a heart rate of <60 or >100 beats/min as being at high risk of a future cardiac event. Having a heart rate outside of the normal range could suggest subclinical arrhythmias with an abnormal rate (<60 and >100 beats/min) or more severe heart failure (>100 beats/min). It is interesting that blood pressure was not a significant risk factor, while heart rate was. Although blood pressure can be abnormal in heart failure, we believe that heart rate is a more indicative finding. This is supported by the Boston criteria, which use tachycardia to diagnose heart failure but do not use blood pressure.43 Our results also identified low temperature as a risk for cardiac events. A low body temperature has previously been associated with a low cardiac output and poor prognosis in patients with heart failure.44

We identified several related factors linked to heart failure and to the severity of heart failure that were associated with cardiac events after PD-1/PD-L1 therapy. Specifically, an elevated heart rate, lower hemoglobin, lower sodium, and lower chloride are all variables associated with the severity of heart failure and were also associated with cardiac events in our study. We found that a high BUN and low creatinine also increased cardiac risk. The BUN to creatinine ratio has long been associated with heart failure severity. ACE inhibitors and loop diuretics were associated with a decreased rate of cardiac events. The novel finding of an association between features of heart failure and future events after PD-1 and PD-L1 inhibitory therapies is logical because a high proportion of the events after PD-1 and PD-L1 therapy are related to heart failure. PD-1 and PD-L1 therapies may be associated with heart failure through a variety of mechanisms. The most common mechanism is through the development of an acute and fulminant myocarditis.14 The second mechanism, which is less common and less well understood, is a progressive decline in the left ventricular ejection fraction without a clear acute fulminant process. Independent of the mechanism, it is likely that an impaired baseline cardiac reserve increases the probability that cardiac injury with an ICI will become clinically manifest as a heart failure event. What is unclear, and needs to be the focus of future studies, is whether the baseline elevated risk with heart failure is related to heart failure with a reduced ejection fraction, heart failure with a preserved ejection fraction, or both, and how baseline cardiac medications such as inhibitors of the angiotensin system or beta-blockers for heart failure may attenuate this risk.

The most common cardiac event on an ICI was the occurrence of atrial fibrillation. Atrial fibrillation and heart failure frequently coexist. We also found that the presence of a prothrombin time lab measurement was predictive and this may relate to the use of anticoagulation in the setting of atrial fibrillation.

Additionally, advanced age was associated with a higher rate of cardiac events in our model. Age-related immunosenescence is generally thought to increase irAE risk through a paradoxically higher concentration of inflammatory cytokines and autoantibodies.45 Abnormal weight/BMI (high or low) and higher LDH levels also predicted cardiac disease (p<0.002). Weight loss can be driven by a variety of factors in patients with cancer, but immunological derangements mediated by cytokines are thought to be a primary driver.46–54 Weight loss is also a general indicator of cancer severity55 as are higher LDH levels.56 Conversely, the association between an elevated weight/BMI and cardiac disease is well known. By checking the trees of our XGBoost model by hand, we identified a BMI of >40 as a cut-off for increased risk of cardiac disease.

Our study does have several limitations. To define cardiac events, we used both curated adverse events that were documented by a team of nurse abstractors and ICD codes documented in the patient chart. The frequency of different cardiac events is described in table 1. In our study, 8% of patients had a cardiac event. In general, we found a high incidence of atrial fibrillation, heart failure, and pericardial disease in our cohort. While the rate of myocarditis in ICI patients is thought to be less than 1%,57 the rate of other cardiac events secondary to ICI therapy is still unknown. A recent meta-analysis found that 0.5% of patients with cancer treated with ICIs developed myocarditis, 0.3% developed heart failure, and 4.6% developed atrial fibrillation. In addition, pericardial effusion occurred in 0.5% of patients, cardiomyopathy in 0.3% of patients, myocardial infarction in 0.4% of patients, and cardiac arrest in 0.4% of patients.58 Our findings are in line with other studies using EHR data. For example, Waheed et al reported that 15% of ICI patients developed new cardiac disease: 0.4% developed cardiomyopathy; 5% developed heart failure; 6% developed an arrhythmia; 2% developed pericardial disease; 2% developed heart block; and 0.2% developed myocarditis.59

It is possible that pre-existing atrial fibrillation/heart failure is being routinely documented by a patient’s oncologist, even if the patient is not experiencing new disease. To address this concern, we did not count ICD/MedDRA codes as cardiac events if similar codes were already part of the patient’s medical history. However, whether these cardiac events truly represent new heart failure, an exacerbation of existing heart failure, or routine documentation of existing disease is impossible to tell with real-world data and is a major limitation of our study.
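
The exclusion rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation. "Similar codes" is approximated here by matching on the 3-character ICD-10 category, and the set of cardiac categories is a hypothetical, deliberately incomplete list. The study's actual similarity rule and its handling of MedDRA codes are not specified in this section.

```python
# Hypothetical, deliberately incomplete set of cardiac ICD-10 categories:
# I48 atrial fibrillation/flutter, I50 heart failure,
# I31 other pericardial disease, I40 acute myocarditis.
CARDIAC_CATEGORIES = {"I48", "I50", "I31", "I40"}


def icd_category(code):
    """Return the 3-character category of an ICD-10 code ('I48.91' -> 'I48')."""
    return code.replace(".", "").upper()[:3]


def new_cardiac_events(history_codes, on_treatment_codes,
                       cardiac_categories=CARDIAC_CATEGORIES):
    """Keep on-treatment cardiac codes whose category is absent from the history.

    Assumption: two codes are "similar" when they share an ICD-10 category.
    """
    prior = {icd_category(code) for code in history_codes}
    events = []
    for code in on_treatment_codes:
        category = icd_category(code)
        if category in cardiac_categories and category not in prior:
            events.append(code)
    return events


# Hypothetical patient: atrial fibrillation is already in the medical history.
history = ["I48.91", "E11.9"]
on_treatment = ["I48.0", "I50.9", "J18.9"]
events = new_cardiac_events(history, on_treatment)
print(events)
```

In this example only the heart failure code is counted. The on-treatment atrial fibrillation code is discarded because the same category already appears in the history, and the pneumonia code is not cardiac. A rule of this kind cannot separate an exacerbation from routine re-documentation, which is the limitation noted above.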

Another limitation is that the ConcertAI database does not have robust documentation of Common Terminology Criteria for Adverse Events grades, so we could not distinguish between severe and non-severe events using the traditional definition.

Finally, while demographics and oncological fields like stage had a very low rate of missingness, laboratory values and vital signs suffered high missingness rates of between 20% and 99%. Traditionally, high missingness rates can introduce bias because (1) patients with missing values have to be excluded from the analysis or (2) missing values are replaced with imputed values. In our case, neither of these mechanisms applies. Because XGBoost treats missing values as a separate entity, no patients are excluded in the modeling process and no values need to be imputed. Missing values are simply treated as another value in SHAP plots and can be interpreted separately from the non-missing values. Indeed, XGBoost has previously been cited for its ability to robustly manage missing data in large EMR datasets.60 In light of these findings, we did not remove features with high missingness rates from our final model. However, in several sensitivity analyses, we did show that removing variables with high missingness or imputing missing values has a minimal impact on our results.
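
One way to picture how a tree can use missing values without imputation is the learned default direction. At each split, XGBoost's sparsity-aware split finding chooses which branch observations with a missing value should follow. The sketch below is a simplified illustration of that idea, not the library's implementation. It scores the two options with squared error rather than XGBoost's gradient-based gain, and the data are hypothetical.

```python
import math


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_default_direction(x, y, threshold):
    """For a split `x < threshold`, choose the branch that missing values follow.

    Missing values are encoded as NaN. Both directions are tried, and the one
    with the lower total squared error is returned together with that error.
    """
    left = [yi for xi, yi in zip(x, y) if not math.isnan(xi) and xi < threshold]
    right = [yi for xi, yi in zip(x, y) if not math.isnan(xi) and xi >= threshold]
    missing = [yi for xi, yi in zip(x, y) if math.isnan(xi)]

    loss_if_left = sse(left + missing) + sse(right)
    loss_if_right = sse(left) + sse(right + missing)
    if loss_if_left <= loss_if_right:
        return "left", loss_if_left
    return "right", loss_if_right


nan = float("nan")
# Hypothetical lab value (x) and outcome (y). Rows with a missing lab value
# have outcomes that resemble those of rows with a high lab value.
x = [1.0, 2.0, nan, 8.0, 9.0, nan]
y = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
direction, loss = best_default_direction(x, y, threshold=5.0)
print(direction, loss)
```

Because the direction is chosen from the data, missingness itself can carry signal in the model, which is consistent with missing values appearing as their own category in the SHAP plots.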

To minimize missing data, we included data from up to 30 days after the start of ICI therapy. With this analysis alone, it is unclear whether the laboratory, vital sign, or other patient measurements were baseline or on-treatment. Therefore, we conducted a sensitivity analysis using only baseline values. The performance of our model decreased minimally, but interpretations with SHAP remained largely the same, suggesting that interpreting these values as baseline and not on-treatment is most appropriate. This is also supported by our general observation that ~75% of these values were collected before the index date.
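
The difference between the main analysis window and the baseline-only sensitivity analysis can be illustrated with a short sketch. It assumes, hypothetically, that each feature takes the most recent eligible measurement. The study's actual aggregation rule is not stated in this section, and the dates and values below are invented.

```python
from datetime import date, timedelta


def select_value(measurements, index_date, days_after=None):
    """Return the most recent eligible measurement value, or None if missing.

    measurements: list of (date, value) pairs.
    days_after=None  -> baseline only (measured strictly before index_date).
    days_after=30    -> anything measured up to 30 days after index_date.
    Assumption: the latest eligible value is used as the feature.
    """
    if days_after is None:
        eligible = [(d, v) for d, v in measurements if d < index_date]
    else:
        cutoff = index_date + timedelta(days=days_after)
        eligible = [(d, v) for d, v in measurements if d <= cutoff]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]


index_date = date(2018, 3, 1)  # hypothetical start of ICI therapy

# Hypothetical lymphocyte percentages for two patients.
patient_a = [(date(2018, 2, 20), 22.0), (date(2018, 3, 15), 15.0),
             (date(2018, 5, 1), 12.0)]
patient_b = [(date(2018, 3, 10), 18.0)]  # no baseline measurement

main_a = select_value(patient_a, index_date, days_after=30)
baseline_a = select_value(patient_a, index_date)
main_b = select_value(patient_b, index_date, days_after=30)
baseline_b = select_value(patient_b, index_date)
print(main_a, baseline_a, main_b, baseline_b)
```

For the second patient the 30-day window recovers a value that would otherwise be missing, which is the motivation for the wider window. For the first patient the two analyses use different values, which is why the baseline-only sensitivity analysis is needed to interpret the features.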

To summarize, ML was able to predict cardiac adverse events with high performance. Using SHAP, we identified multiple risk factors for cardiac events, including immunological labs and medications, oncological factors, and elements of the patient's cardiac history. Further research is needed to triage PD-1/PD-L1 patients’ risk of cardiac disease.

Data availability statement

Data may be obtained from a third party and are not publicly available.

Ethics statements

Patient consent for publication


Editorial support was provided by Rodney J Moore, PhD (Bristol Myers Squibb) under the direction of the authors.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Contributors All authors contributed to this work and are responsible for it in its entirety.

  • Funding TGN was supported, in part, through the Kohlberg Foundation, NIH/NHLBI (RO1HL130539, RO1HL137562, and K24HL150238), and NIH/Harvard Center for AIDS Research (P30 AI060354).

  • Competing interests TGN has received advisory fees from BMS, H3 Biomedicine, Amgen, AbbVie, and Intrinsic Imaging and grant support from Astra Zeneca.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.