TY - JOUR
T1 - Comparative Effectiveness of Machine Learning Approaches for Predicting Gastrointestinal Bleeds in Patients Receiving Antithrombotic Treatment
AU - Herrin, Jeph
AU - Abraham, Neena S.
AU - Yao, Xiaoxi
AU - Noseworthy, Peter A.
AU - Inselman, Jonathan
AU - Shah, Nilay D.
AU - Ngufor, Che
N1 - Funding Information:
Administrative, technical, or material support: Abraham, Shah. Supervision: Noseworthy, Ngufor. Conflict of Interest Disclosures: Dr Herrin reported receiving research funding from the National Cancer Institute, the Agency for Healthcare Research and Quality (AHRQ), the Patient Centered Outcomes Research Institute, and the Centers for Medicare & Medicaid Services. Dr Noseworthy reported receiving research funding from the National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health (NIH); the National Institute on Aging, NIH; the AHRQ; the US Food and Drug Administration (FDA); and the American Heart Association; having a potential equity and royalty relationship with AliveCor; and serving as a study investigator in an ablation trial sponsored by Medtronic. Dr Shah reported receiving grants from the AHRQ during the conduct of the study and receiving research support from the FDA; the Center for Medicare and Medicaid Innovation; the AHRQ; the NHLBI, NIH; the National Science Foundation; the Medical Device Innovation Consortium; and the Patient Centered Outcomes Research Institute. Dr Ngufor reported receiving grants from the NIH during the conduct of the study. No other disclosures were reported.
Funding/Support: This research was funded by grant R01HS025402 (Dr Abraham) from the Agency for Healthcare Research and Quality.
Publisher Copyright:
© 2021 American Medical Association. All rights reserved.
PY - 2021/05/21
Y1 - 2021/05/21
AB - Importance: Anticipating the risk of gastrointestinal bleeding (GIB) when initiating antithrombotic treatment (oral antiplatelets or anticoagulants) is limited by existing risk prediction models. Machine learning algorithms may result in superior predictive models to aid in clinical decision-making. Objective: To compare the performance of 3 machine learning approaches with the commonly used HAS-BLED (hypertension, abnormal kidney and liver function, stroke, bleeding, labile international normalized ratio, older age, and drug or alcohol use) risk score in predicting antithrombotic-related GIB. Design, Setting, and Participants: This retrospective cross-sectional study used data from the OptumLabs Data Warehouse, which contains medical and pharmacy claims on privately insured patients and Medicare Advantage enrollees in the US. The study cohort included patients 18 years or older with a history of atrial fibrillation, ischemic heart disease, or venous thromboembolism who were prescribed oral anticoagulant and/or thienopyridine antiplatelet agents between January 1, 2016, and December 31, 2019. Exposures: A cohort of patients prescribed oral anticoagulant and thienopyridine antiplatelet agents was divided into development and validation cohorts based on date of index prescription. The development cohort was used to train 3 machine learning models to predict GIB at 6 and 12 months: regularized Cox proportional hazards regression (RegCox), random survival forests (RSF), and extreme gradient boosting (XGBoost). Main Outcomes and Measures: The performance of the models for predicting GIB in the validation cohort, evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value, and prediction density plots. Relative importance scores were used to identify the variables that were most influential in the top-performing machine learning model. 
Results: In the entire study cohort of 306,463 patients, 166,177 (54.2%) were male, 193,648 (63.2%) were White, the mean (SD) age was 69.0 (12.6) years, and 12,322 (4.0%) had experienced a GIB. In the validation data set, the HAS-BLED model had an AUC of 0.60 for predicting GIB at 6 months and 0.59 at 12 months. The RegCox model performed the best in the validation set, with an AUC of 0.67 at 6 months and 0.66 at 12 months. XGBoost was similar, with AUCs of 0.67 at 6 months and 0.66 at 12 months, whereas for RSF, AUCs were 0.62 at 6 months and 0.60 at 12 months. The variables with the highest importance scores in the RegCox model were prior GI bleed (importance score, 0.72); atrial fibrillation, ischemic heart disease, and venous thromboembolism combined (importance score, 0.38); and use of gastroprotective agents (importance score, 0.32). Conclusions and Relevance: In this cross-sectional study, the machine learning models examined showed similar performance in identifying patients at high risk for GIB after being prescribed antithrombotic agents. Two models (RegCox and XGBoost) performed modestly better than the HAS-BLED score. A prospective evaluation of the RegCox model compared with HAS-BLED may provide a better understanding of the clinical impact of improved performance.
UR - http://www.scopus.com/inward/record.url?scp=85107008021&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107008021&partnerID=8YFLogxK
U2 - 10.1001/jamanetworkopen.2021.10703
DO - 10.1001/jamanetworkopen.2021.10703
M3 - Article
C2 - 34019087
AN - SCOPUS:85107008021
SN - 2574-3805
VL - 4
JO - JAMA Network Open
JF - JAMA Network Open
IS - 5
M1 - 10703
ER -