Machine learning does not improve upon traditional regression in predicting outcomes in atrial fibrillation: an analysis of the ORBIT-AF and GARFIELD-AF registries
EP Europace Journal

Abstract
Prediction models for outcomes in atrial fibrillation (AF) are used to guide treatment. While regression models have been the analytic standard for prediction modelling, machine learning (ML) has been promoted as a potentially superior methodology. We compared the performance of ML and regression models in predicting outcomes in AF patients.
The Outcomes Registry for Better Informed Treatment of Atrial Fibrillation (ORBIT-AF) and Global Anticoagulant Registry in the FIELD (GARFIELD-AF) are population-based registries that include 74 792 AF patients. Models were generated from potential predictors using stepwise logistic regression (STEP), random forests (RF), gradient boosting (GB), and two neural networks (NNs). Discriminatory power was highest for death [STEP area under the curve (AUC) = 0.80 in ORBIT-AF, 0.75 in GARFIELD-AF] and lowest for stroke in all models (STEP AUC = 0.67 in ORBIT-AF, 0.66 in GARFIELD-AF). The discriminatory power of the ML models was similar to or lower than that of the STEP models for most outcomes. The GB model had a higher AUC than STEP for death in GARFIELD-AF (0.76 vs. 0.75), but only nominally, and both performed similarly in ORBIT-AF. The multilayer NN had the lowest discriminatory power for all outcomes. The calibration of the STEP models was more closely aligned with the observed events for all outcomes. In the cross-registry models, the discriminatory power of the ML models was similar to or lower than that of the STEP models in most cases.
When developed from two large, community-based AF registries, ML techniques did not improve prediction modelling of death, major bleeding, or stroke.
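As an illustration of the two performance measures named above, the following minimal sketch computes discrimination (AUC) and a simple calibration summary for two sets of predicted risks. It is not the study's code: the function names, the toy labels, and the predicted risks for `model_a` and `model_b` are all hypothetical, and the calibration measure shown (mean predicted risk minus observed event rate) is only one simple way to summarise calibration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen patient with the event
    has a higher predicted risk than a randomly chosen patient without it
    (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_gap(labels, scores):
    """Mean predicted risk minus observed event rate (0 is ideal)."""
    return sum(scores) / len(scores) - sum(labels) / len(labels)


# Hypothetical toy data: 1 = event occurred, 0 = no event.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
# Hypothetical predicted risks from two competing models.
model_a = [0.1, 0.3, 0.4, 0.2, 0.8, 0.7, 0.5, 0.9]
model_b = [0.2, 0.6, 0.4, 0.1, 0.7, 0.5, 0.3, 0.9]

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "AUC:", auc(labels, scores),
          "calibration gap:", round(calibration_gap(labels, scores), 4))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so nominal differences such as 0.76 vs. 0.75 indicate nearly identical ranking ability.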
Contributors

Zak Loring
Author

Suchit Mehrotra
Author

Jonathan P Piccini
Author

John Camm
Author

David Carlson
Author

Gregg C Fonarow
Author

Keith A A Fox
Author

Eric D Peterson
Author

Karen Pieper
Author

Ajay K Kakkar
Author
