From clinic to couch: an uncertainty-aware deep learning approach for ECG analysis across modalities

European Heart Journal - Digital Health

12 January 2026

Abstract

Introduction

Deep learning-based analysis of electrocardiograms (ECGs) has made significant progress in recent years, but the variability in ECG data caused by different recording techniques remains a significant challenge. ECGs can be recorded in various modalities, including telemedical (tECG), resting (rECG), and long-term (hECG) recordings, each with unique characteristics and applications. tECGs, commonly used in remote patient monitoring for conditions such as heart failure, are recorded by patients at home and use a limited number of non-standardized leads. rECGs are typically short recordings taken in clinical settings using standardized leads, while hECGs involve continuous recording over 24 hours or more. The resulting differences in data quality, duration, and clinical context can significantly impact model performance.

Purpose

We aim to develop cross-modal models that reliably signal likely failures when applied to modalities underrepresented during training. We hypothesize that modality imbalance during training significantly affects the generalization ability of deep learning models, and that model confidence can serve as an indicator of domain shift.

Methods

We develop a deep learning model by training and evaluating on ECG data from three distinct modalities. Our approach involves using publicly available datasets for rECGs and proprietary datasets for tECGs and hECGs. We systematically vary the ratio of training data from each modality and analyze its effect on model performance and generalization. To assess model reliability under domain shift, we apply Monte Carlo (MC) Dropout to estimate predictive uncertainty. This allows us to quantify the confidence of the model across modalities.
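As an illustration of the MC Dropout step described above, the following is a minimal sketch rather than the model used in this study. It assumes a hypothetical one-hidden-layer binary classifier with hand-picked toy weights; the function names, the dropout rate, the number of passes, and the use of predictive entropy as the uncertainty score are all assumptions made for the example. The key idea it shows is that dropout stays active at inference time and the spread of repeated stochastic predictions is summarized as an uncertainty estimate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward_with_dropout(features, w_hidden, w_out, drop_p, rng):
    """One stochastic forward pass of a toy one-hidden-layer classifier.

    Dropout is applied at inference time, which is the core of MC Dropout.
    """
    hidden = []
    for weights in w_hidden:
        act = max(0.0, sum(w * x for w, x in zip(weights, features)))  # ReLU
        # Inverted dropout: zero the unit with probability drop_p, else rescale.
        keep = rng.random() >= drop_p
        hidden.append(act / (1.0 - drop_p) if keep else 0.0)
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return sigmoid(logit)  # hypothetical probability of the positive class


def mc_dropout_predict(features, w_hidden, w_out, drop_p=0.5, n_passes=200, seed=0):
    """Return (mean probability, predictive entropy in bits) over n_passes."""
    rng = random.Random(seed)
    probs = [
        forward_with_dropout(features, w_hidden, w_out, drop_p, rng)
        for _ in range(n_passes)
    ]
    mean_p = sum(probs) / n_passes
    # Clamp away from 0 and 1 so the logarithms stay finite.
    p = min(max(mean_p, 1e-12), 1.0 - 1e-12)
    entropy = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return mean_p, entropy


if __name__ == "__main__":
    # Toy weights and input features, chosen only for illustration.
    toy_hidden = [[0.8, -0.4], [0.3, 0.9], [-0.5, 0.2]]
    toy_out = [1.2, -0.7, 0.5]
    mean_p, entropy = mc_dropout_predict([1.0, 2.0], toy_hidden, toy_out)
    print(f"mean probability={mean_p:.3f}, entropy={entropy:.3f} bits")
```

In this sketch, an entropy close to 1 bit means the averaged prediction is close to 0.5, so recordings with high entropy would be the ones flagged as potentially affected by domain shift.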

Results

Our experiments show significant differences in model performance depending on the ECG modality used for training and evaluation. Models trained predominantly on data from a single ECG modality often show reduced performance when applied to another, highlighting the challenges of cross-modality generalization. For example, shifting the training data distribution further toward tECGs improved accuracy on a telemedical dataset (500 sinus rhythm and 500 atrial fibrillation ECGs) from 91.7% to 94.8%, a gain of 3.1 percentage points. MC Dropout consistently estimates increased model uncertainty associated with this domain shift.
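The kind of rebalancing described above can be sketched as a sampler that draws a fixed-size training set with a chosen share of each modality. This is a hypothetical illustration, not the procedure used in the study: the function name `mix_by_modality`, the pool sizes, and the 60/30/10 ratio are assumptions made for the example.

```python
import random


def mix_by_modality(pools, ratios, n_total, seed=0):
    """Draw n_total training records with the requested modality ratios.

    pools:  dict mapping a modality name to a list of records
    ratios: dict mapping the same names to fractions that sum to 1
    """
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    rng = random.Random(seed)
    # Largest-remainder rounding so the counts add up to exactly n_total.
    exact = {m: ratios[m] * n_total for m in ratios}
    counts = {m: int(exact[m]) for m in ratios}
    leftover = n_total - sum(counts.values())
    by_remainder = sorted(ratios, key=lambda m: exact[m] - counts[m], reverse=True)
    for m in by_remainder[:leftover]:
        counts[m] += 1
    mixed = []
    for m, k in counts.items():
        if k > len(pools[m]):
            raise ValueError(f"not enough {m} records for the requested ratio")
        # Sample without replacement and keep the modality label per record.
        mixed.extend((m, rec) for rec in rng.sample(pools[m], k))
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder record identifiers stand in for real ECG recordings.
    pools = {
        "rECG": [f"r{i}" for i in range(100)],
        "tECG": [f"t{i}" for i in range(100)],
        "hECG": [f"h{i}" for i in range(100)],
    }
    train = mix_by_modality(pools, {"rECG": 0.6, "tECG": 0.3, "hECG": 0.1}, n_total=50)
    print(len(train), train[:3])
```

Repeating the draw with different ratio dictionaries while holding `n_total` fixed is one way to separate the effect of modality composition from the effect of training set size.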

Conclusion

Our study highlights that ECG modality is not merely a technical detail but a key factor with a substantial impact on the performance, generalizability, and reliability of deep-learning-based ECG analysis. To address this, we propose an approach that shows high robustness across diverse ECG types. Our method recognizes modality-specific characteristics within ECG recordings and integrates uncertainty estimation to improve the performance stability and reliability of deep learning models in real-world ECG analysis.