Prediction of normal echocardiograms from 12-lead electrocardiograms using deep learning to improve outpatient screening for structural heart disease
European Heart Journal - Digital Health

Abstract
Echocardiography is indispensable for diagnosing structural heart disease (SHD). However, over 40% of initial outpatient imaging studies show no clinically significant findings, contributing to unnecessary utilization and capacity limitations in cardiology clinics.
We aimed to develop and validate an artificial intelligence-based electrocardiogram interpretation (AI-ECG) model that can accurately identify patients unlikely to have clinically significant SHD, enabling more efficient use of echocardiography.
We retrospectively paired ECGs with transthoracic echocardiograms (performed within 90 days) from two Dutch hospitals (2009-2023). We built an ensemble model in two stages. First, a convolutional neural network analyzed the median beat ECG and produced nine probability scores, one for each predefined abnormality: moderate-or-greater valvular disease, left or right ventricular systolic dysfunction or dilation, and grade ≥II diastolic dysfunction. These probabilities, along with patient age and sex, were then input into an XGBoost classifier, which output a single probability that the echocardiogram would be abnormal (any of the nine abnormalities). The operating point was fixed at 95% sensitivity in the development set to maximize negative predictive value, and model performance was evaluated in an outpatient test cohort.
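The operating-point step can be illustrated with a minimal sketch. It assumes the ensemble has already produced one abnormality probability per ECG; the function names, the toy scores, and the rule "flag when score >= threshold" are illustrative assumptions, not the authors' implementation.

```python
import math


def threshold_at_sensitivity(scores, labels, target=0.95):
    """Return the highest threshold whose sensitivity is >= target.

    A study is flagged as abnormal when score >= threshold.
    (Hypothetical helper; the abstract does not describe the exact procedure.)
    """
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one abnormal case")
    # Smallest number of abnormal cases that must be flagged to reach the target.
    k = math.ceil(target * len(positives))
    return positives[k - 1]


def rule_out_metrics(scores, labels, threshold):
    """Sensitivity, specificity, NPV and deferred fraction at a fixed threshold."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        flagged = s >= threshold
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    n_negative_calls = tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / n_negative_calls if n_negative_calls else float("nan"),
        "deferred_fraction": n_negative_calls / (tp + fn + tn + fp),
    }


# Toy development data (made up for illustration only).
dev_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.85, 0.90]
dev_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

threshold = threshold_at_sensitivity(dev_scores, dev_labels, target=0.95)
metrics = rule_out_metrics(dev_scores, dev_labels, threshold)
```

In this sketch the threshold is chosen once on development data and then reused unchanged on a held-out cohort, which mirrors the prespecified operating point described above.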
The development sets comprised 80,635 ECGs from 52,006 patients (31.5% SHD), and the outpatient test cohort comprised 4,024 ECGs from 4,024 patients (20.3% SHD). Figure 1 shows the ROC curve (AUROC 0.84, 95% CI 0.83–0.86). At the prespecified threshold, sensitivity was 0.95 (95% CI: 0.93–0.96), specificity 0.35 (95% CI: 0.34–0.37), PPV 0.27 (95% CI: 0.25–0.29) and NPV 0.96 (95% CI: 0.95–0.98). The model would have deferred 1,130 (28.1%) echocardiograms while missing 41 (1.0%) significant SHD cases, none classified as severe.
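As a worked check, the deferral rate, miss rate, and NPV follow directly from the three counts reported for the outpatient test cohort. The sketch below uses only those counts; the variable names are illustrative, and it treats the 41 missed cases as the false negatives among the 1,130 deferred studies.

```python
# Counts reported for the outpatient test cohort.
n_total = 4024     # ECG-echocardiogram pairs (one per patient)
n_deferred = 1130  # model-negative studies, i.e. echocardiograms deferred
n_missed = 41      # deferred studies that had significant SHD (false negatives)

# Deferred studies without significant SHD are the true negatives.
true_negatives = n_deferred - n_missed

deferral_rate = n_deferred / n_total   # share of echocardiograms deferred
miss_rate = n_missed / n_total         # share of the cohort with missed SHD
npv = true_negatives / n_deferred      # negative predictive value

print(f"deferred: {deferral_rate:.1%}")
print(f"missed:   {miss_rate:.1%}")
print(f"NPV:      {npv:.2f}")
```

Sensitivity, specificity, and PPV are not recomputed here because they also require the exact number of SHD cases, which the text gives only as a rounded percentage.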
An AI-ECG screening model can effectively rule out clinically significant SHD with high sensitivity and NPV. In our retrospective analysis, it could potentially reduce the volume of outpatient echocardiograms by nearly 30%, improving efficiency. Prospective trials are warranted to validate these findings and to assess the safety, clinical integration workflow, and cost-effectiveness of implementing the model in practice.

Figure: Discriminative performance of AI-ECG
Contributors

Authors: W A C Van Amsterdam, D Van Osch, T P Mast, M B Vessies, W A L Tonino, M Van 'T Veer, A J Teske, P Van Der Harst, R R Van De Leur, R Van Es