Self-supervised learning can tackle life-threatening arrhythmia detection data challenges

European Heart Journal - Digital Health

12 January 2026
Organised by: ESC Journals

Abstract

Background

Life-threatening arrhythmias (LTAs), such as ventricular tachycardia and ventricular fibrillation, are critical cardiac events that require immediate intervention to prevent sudden cardiac death. Continuous monitoring systems and accurate LTA detection from electrocardiogram (ECG) signals could allow timely intervention. However, two key challenges hinder the development of robust LTA detection models: low data availability and significant class imbalance, both of which stem from the rarity of LTA events and the difficulty of collecting annotated examples.

Recently, Foundation Models (FMs) have emerged as powerful large-scale deep learning models that are trained on huge datasets and can be easily adapted to different downstream tasks. They usually exploit self-supervised learning (SSL) approaches to learn from unlabeled datasets, saving the time and resources otherwise spent on data labeling.

Purpose

In this work, we aim to overcome the low data availability and class imbalance challenges with a novel SSL-based FM, pre-trained on a large set of unlabeled ECGs and fine-tuned on limited LTA data for LTA detection. The ultimate goal is to develop a reliable detection system that, combined with wearable devices, can be used for continuous monitoring and for immediately alerting the local emergency service.

Methods

We pre-trained our model on 406,117 ECGs from open-access datasets (without LTA events) using an SSL approach. We then fine-tuned the model on the target dataset (82 recordings including LTA events).

We tested different models to explore two batch sampling strategies (uniform and stratified) and four loss functions (binary cross-entropy and its weighted, focal, and weighted-focal variants). Class-weighted loss and uniform sampling aim to enhance the detection of the underrepresented LTA class, while focal loss focuses on hard-to-classify cases. The SSL approach, in contrast, aims to address the data scarcity issue by leveraging the knowledge extracted from the large pre-training dataset.
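The four loss functions named above can be viewed as one formula with two switches. The following is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the `pos_weight` and `gamma` parameters, and the example probabilities are all hypothetical, and the abstract does not report the weights or focal exponent that were actually used.

```python
import math


def weighted_focal_bce(p, y, pos_weight=1.0, gamma=0.0, eps=1e-12):
    """Per-sample binary loss covering the four variants (illustrative only).

    p          : predicted probability of the positive (LTA) class
    y          : true label, 1 for LTA and 0 otherwise
    pos_weight : multiplier for positive samples (1.0 means unweighted)
    gamma      : focal exponent (0.0 means no focal term)
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability given to the true class
    weight = pos_weight if y == 1 else 1.0    # up-weight the rare class only
    # (1 - p_t) ** gamma shrinks the loss of confidently correct samples
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_loss(probs, labels, **kwargs):
    """Average the per-sample loss over a batch."""
    losses = [weighted_focal_bce(p, y, **kwargs) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical batch: one LTA segment among mostly non-LTA segments.
    probs = [0.60, 0.10, 0.05, 0.30]
    labels = [1, 0, 0, 0]
    variants = {
        "binary cross-entropy": dict(),
        "weighted": dict(pos_weight=3.0),
        "focal": dict(gamma=2.0),
        "weighted-focal": dict(pos_weight=3.0, gamma=2.0),
    }
    for name, kwargs in variants.items():
        print(f"{name}: {mean_loss(probs, labels, **kwargs):.4f}")
```

Setting `pos_weight=1.0` and `gamma=0.0` recovers plain binary cross-entropy, so the four variants differ only in which of the two switches is turned on.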

Finally, we compared the SSL approach with the traditional supervised training of the model directly on the target LTA dataset.

Results

The best SSL model achieved 92.21% macro F1 (MF1), 97.35% sensitivity (Se), and 97.36% specificity (Sp) with the baseline loss and stratified batch sampling. The best supervised model reached 89.55% MF1, 95.5% Se, and 96.4% Sp using weighted-focal loss and uniform batch sampling. Overall, the SSL approach (average across all fine-tuned models: 97.38 ± 1.44% Se, 96.21 ± 0.62% Sp, 89.67 ± 1.59% MF1) outperformed the fully supervised one (93.37 ± 2.43% Se, 93.28 ± 2.02% Sp, 83.17 ± 3.83% MF1), achieving high performance regardless of balancing strategy.
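For readers less familiar with the three reported metrics, the sketch below computes them from binary labels using their standard definitions. It is an illustration under assumptions: the function name `binary_metrics` and the toy labels are hypothetical, and the abstract does not state the unit of evaluation (for example, per segment or per recording).

```python
def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and macro F1 for binary labels.

    Labels are 1 for the LTA class and 0 for the non-LTA class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Sensitivity: fraction of true LTA samples that were detected.
    se = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-LTA samples correctly left unflagged.
    sp = tn / (tn + fp) if (tn + fp) else 0.0

    # Per-class F1, treating each class in turn as the "positive" one.
    f1_lta = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    f1_non = 2 * tn / (2 * tn + fn + fp) if (2 * tn + fn + fp) else 0.0

    # Macro F1: unweighted mean of the per-class F1 scores, so the rare
    # LTA class counts as much as the majority class.
    mf1 = (f1_lta + f1_non) / 2
    return {"Se": se, "Sp": sp, "MF1": mf1}


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the study.
    y_true = [1, 1, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Because macro F1 averages the two per-class scores without weighting by class size, it can sit well below both Se and Sp when the positive class is rare and a few false positives depress its precision.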

Conclusion

The SSL model demonstrated superior performance in LTA detection, as well as robustness to data scarcity and class imbalance without the need for specific balancing strategies, paving the way for reliable real-time LTA monitoring systems that enable timely, life-saving intervention.

Figure: Performance comparison

Contributors

B Zanchi

Author

University of Lugano, Lugano, Switzerland

G Conte

Author

F D Faraci

Author

Institute of Digital Technologies for Personalized Healthcare, Viganello, Switzerland
