Abstract
Background Immune checkpoint inhibitors (ICIs) have become a pillar of cancer therapy. However, they are associated with immune-related adverse events (irAEs), which pose a significant barrier to ICI usage. ICI-induced inflammatory arthritis (ICI-IA) is one of these irAEs. Research into ICI-IA is still nascent, and identifying ICI-IA patients poses a critical challenge for future research efforts and patient care. In this study, we describe the development of a machine learning strategy to identify ICI-IA in the electronic health record (EHR), capture key clinical features, and determine associations with other irAEs, cancer type, and ICI therapy.
Methods We conducted a retrospective study of 89 patients who developed ICI-IA out of 2451 patients who received ICI therapy between March 2011 and January 2021. Logistic regression and random forest machine learning models were trained on all EHR diagnoses, labs, medications, and procedures to identify ICI-IA. Multivariate logistic regression was used to determine the association between ICI-IA and cancer type, ICI therapy, and comorbid irAEs, with 871 ICI patients who did not develop ICI-IA serving as the comparison group.
Results Logistic regression and random forest models achieved high performance, with an area under the receiver operating characteristic curve (ROC AUC) of 0.804 (95% confidence interval [CI] 0.745–0.863) and 0.805 (95% CI 0.751–0.859), respectively (table 1). Top features from the random forest model included ICI-IA relevant features (joint pain, steroids, rheumatoid factor) and features suggesting comorbid irAEs (thyroid function tests, pruritus, triamcinolone) (figure 1). ICI-IA patients had increased odds of developing a cutaneous irAE, odds ratio (OR) = 2.66 (95% CI 1.63–4.35), an endocrine irAE, OR = 2.09 (95% CI 1.15–3.80), or a gastrointestinal irAE, OR = 2.88 (95% CI 1.76–4.72), adjusting for sex, age, race, ethnicity, cancer type, and ICI regimen (table 2). Melanoma patients, OR = 1.99 (95% CI 1.08–3.65), and renal cell carcinoma patients, OR = 2.03 (95% CI 1.06–3.84), were more likely to develop ICI-IA compared to lung cancer patients. Patients on nivolumab+ipilimumab were more likely to develop ICI-IA compared to patients on pembrolizumab, OR = 1.86 (95% CI 1.01–3.43) (table 3).
Conclusions Our machine learning models can identify patients with ICI-IA using structured EHR data. Patients with ICI-IA had increased odds of developing cutaneous, endocrine, and gastrointestinal irAEs, independent of cancer type and ICI regimen. To our knowledge, this is the first study to test the association of irAEs with ICI-IA against a control cohort of ICI patients without ICI-IA, with both cohorts fully adjudicated for irAEs.
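The ROC AUC values and 95% CIs reported in the Results can be illustrated with a minimal sketch. It computes ROC AUC as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, and attaches a percentile bootstrap interval. The abstract does not state how its confidence intervals were obtained, so the bootstrap here is an assumption, and the function names (`roc_auc`, `bootstrap_ci`) and the toy labels and scores are hypothetical, not study data.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for ROC AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = ICI-IA case, 0 = control; scores are model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
print(bootstrap_ci(toy_labels, toy_scores, n_boot=200))
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; a rank-based formulation would scale better for a cohort of several hundred patients.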
Performance metrics for Logistic Regression and Random Forest machine learning models
Odds ratio (OR) of developing irAEs given ICI-induced inflammatory arthritis (ICI-IA). Cases were 89 patients with ICI-IA. Controls were 871 patients without ICI-IA. *P-values < 0.05 were considered significant (bold).
Odds of developing ICI-induced inflammatory arthritis (ICI-IA) given cancer type and first ICI. *P-values < 0.05 were considered significant (bold).
EHR code feature importance and association with ICI-induced inflammatory arthritis (ICI-IA). Left: feature importance for identifying ICI-IA patients by the random forest model. Right: odds of a patient having an EHR feature if they developed ICI-IA versus if they did not, by Fisher's exact test. Error bars are 95% confidence intervals. ‘ICI-IA’ codes are those directly relevant to ICI-IA. ‘irAE’ codes are those potentially describing other irAEs. ‘Other’ codes are those describing other parts of the patient's medical history. The top codes are predominantly ICI-IA relevant codes, with a high concentration of relevant codes occupying the topmost importance. The top irAE-related codes are endocrine (cortisol, thyroid function tests, estradiol), myositis, triamcinolone, and pruritus. The majority of the top codes are positively associated with ICI-IA.
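The per-feature association in the right panel can be sketched as an unadjusted odds ratio from a 2x2 table with a two-sided Fisher's exact test. This is an illustration only: the Wald (log-scale) confidence interval is a swapped-in approximation, since the caption does not say how its intervals were computed, and the function names are hypothetical. The cohort sizes (89 cases, 871 controls) come from the text, but the feature counts in the example are invented. The adjusted ORs in the Results come from multivariate logistic regression, which this sketch does not reproduce.

```python
from math import comb, exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted OR with a Wald 95% CI.

    a: cases with the feature,    b: cases without it,
    c: controls with the feature, d: controls without it.
    """
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when any cell is zero")
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum of the probabilities of all
    tables with the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / total
        for x in range(x_min, x_max + 1)
    }
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7)))


# Hypothetical counts: 30 of 89 cases and 120 of 871 controls carry the code.
cases_with, cases_without = 30, 89 - 30
controls_with, controls_without = 120, 871 - 120
print(odds_ratio_wald_ci(cases_with, cases_without, controls_with, controls_without))
print(fisher_exact_two_sided(cases_with, cases_without, controls_with, controls_without))
```

Applying a test like this to many EHR codes at once would normally call for a multiple-comparison correction; whether the study applied one is not stated in this section.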
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0/.