Contrastive learning to enrich ECG with cardiac MRI to predict structural features and cardiovascular disease
European Heart Journal - Digital Health

Abstract
Cardiovascular disease (CVD) remains the leading cause of mortality. Early detection of CVD requires diagnostic tools that are scalable, accessible, and low-cost. While cardiac magnetic resonance imaging (CMR) provides detailed structural and functional cardiac information, its limited availability and high cost restrict widespread use. In contrast, the electrocardiogram (ECG) is widely available but lacks the rich anatomical and mechanical information of CMR. We hypothesize that ECG-based latent representations can be enriched with CMR features by leveraging contrastive learning (CL).
We aim to develop a CL framework that fuses CMR metrics into the ECG signal representation, and to use it to predict CMR features and CVD outcomes.
We used 63,448 subjects from the UK Biobank with same-day short-axis cine CMR and 12-lead resting ECG recordings. A pretrained segmentation model was used to crop the CMR images around the heart. As proposed in previous work, the model learns cross-modal latent representations by minimizing the distance between ECG and CMR data from the same subject and maximizing the distance between pairs from different participants (Figure 1). We further enhanced the CMR representation with 3D cardiac volumes from the end-systolic (ES) and end-diastolic (ED) timepoints. We evaluated the ECG encoder's ability to predict: (1) CMR metrics, including left ventricular ejection fraction (LVEF), right ventricular ejection fraction (RVEF) and cardiac output; and (2) the most prevalent cardiovascular diseases, including coronary artery disease (CAD), atrial fibrillation (AF), sudden cardiac death (SCD), heart failure (HF), myocardial infarction (MI) and cardiomyopathy (CMP). We used 47,527/10,185/10,185 subjects for the training/validation/held-out test cohorts. We compared five models: (1) ECG only, (2) ECG with CL trained on one mid-ventricular end-diastolic slice from the CMR image, (3) ECG with CL trained on 2D+time data of one middle slice over time, (4) ECG with CL using 3D CMR data with multiple slices over the volume, and (5) ECG with triplet contrastive loss (TCL) with 4D CMR data combining the end-diastolic and end-systolic volumes over time.
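The cross-modal objective described above can be illustrated with a small sketch. The abstract does not specify the exact loss, so the following assumes a symmetric InfoNCE-style formulation over cosine similarities, which is one common way to pull same-subject ECG/CMR pairs together and push different-subject pairs apart. The function name `symmetric_contrastive_loss`, the `temperature` value, and the embedding shapes are hypothetical and are not taken from the study.

```python
import numpy as np


def _log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(ecg_emb, cmr_emb, temperature=0.1):
    """Illustrative symmetric InfoNCE-style loss (an assumption, not the authors' code).

    ecg_emb, cmr_emb: arrays of shape (n_subjects, dim); row i of each array
    is assumed to come from the same subject.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    ecg = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)
    cmr = cmr_emb / np.linalg.norm(cmr_emb, axis=1, keepdims=True)

    # Pairwise similarities; the diagonal holds the same-subject pairs.
    logits = ecg @ cmr.T / temperature
    idx = np.arange(logits.shape[0])

    # Each ECG should pick out its own CMR among all CMRs in the batch, and vice versa.
    loss_ecg_to_cmr = -_log_softmax(logits, axis=1)[idx, idx].mean()
    loss_cmr_to_ecg = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ecg_to_cmr + loss_cmr_to_ecg)
```

In this sketch the loss is low when each ECG embedding is most similar to the CMR embedding of the same subject, and high when the pairing is scrambled. The triplet variant compared in the abstract would involve a third input and is not covered here.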
The model with TCL showed the highest performance for the prediction of CMR features, with an average R² of 0.605 (Figure 2). Adding temporal dynamics and 3D volume information improved model performance compared with using a single image. However, it did not improve prediction of the clinical endpoints.
We designed a CL pre-training strategy that proves effective in enriching ECG representations with embeddings derived from 3D CMR volumes, enabling low-cost and non-invasive ECG-based risk stratification for cardiac pathology in the general population.
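For readers unfamiliar with the reported metric, the sketch below shows one way an average coefficient of determination over several CMR metrics could be computed. The abstract does not state how the average was formed, so an unweighted mean over metrics is assumed here; the helper names `r2_score` and `mean_r2` are hypothetical.

```python
import numpy as np


def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_r2(y_true, y_pred):
    """Unweighted mean R² across metrics (assumed averaging scheme).

    y_true, y_pred: arrays of shape (n_subjects, n_metrics), one column per
    CMR metric (for example LVEF, RVEF, cardiac output).
    """
    per_metric = [
        r2_score(y_true[:, j], y_pred[:, j]) for j in range(y_true.shape[1])
    ]
    return float(np.mean(per_metric))
```

Under this definition a value of 1 corresponds to perfect prediction and a value of 0 to a model that does no better than predicting the mean of each metric.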
Contributors

L A F Alvarez-Florez, F R Raijmakers, J W Wiers, S R C Ruiperez-Campillo, M Z H K Kolk, E J B Bekkers, F V Y T Tjong