Deep learning-based automatic segmentation in pediatric cardiovascular imaging

European Heart Journal - Digital Health

12 January 2026
Organised by: ESC Journals

Abstract

Background/Introduction

Automatic segmentation of cardiovascular structures from medical images is essential for the diagnosis and monitoring of congenital heart disease. Manual segmentation is labour-intensive and time-consuming. Deep learning models, including the U-Net architecture, offer an automated and precise alternative. Achieving high accuracy is both crucial and challenging, and it depends heavily on dataset quality, data processing strategies, and model architecture.

Purpose

This study set out to develop and evaluate a 3D U-Net model tailored for segmenting major cardiovascular structures in pediatric medical images. Our goal was to understand how different strategies—such as dataset composition (local, open-source, and combined), data augmentation, hyperparameter tuning, and architectural enhancements like attention mechanisms—affect segmentation performance.

Methods

We trained a 3D U-Net using three dataset configurations: a local clinical dataset (48 patients), an open-source dataset, and a combined dataset (85 patients). Training was conducted on a high-performance computing (HPC) system with Tesla V100 GPU acceleration. To assess the impact of various optimization methods, we systematically ran experiments with and without data augmentation (random affine and elastic transformations). We also performed hyperparameter tuning with Optuna and compared the results against untuned baselines. Finally, we compared a standard 3D U-Net architecture with a modified version incorporating Attention Gates. We assessed model performance using the Dice Similarity Coefficient (DSC) and the Jaccard Index.
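Both overlap metrics have standard definitions: for a predicted mask P and a reference mask G, DSC = 2|P ∩ G| / (|P| + |G|) and Jaccard = |P ∩ G| / |P ∪ G|. The following is a minimal NumPy sketch of those definitions for binary 3D masks. The function names, the toy volumes, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details of the study's evaluation pipeline.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else float(2.0 * intersection / total)


def jaccard_index(pred, target):
    """Jaccard Index (intersection over union) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # Same assumed convention for the empty-mask case.
    return 1.0 if union == 0 else float(intersection / union)


# Toy 3D volumes (hypothetical data): two overlapping blocks in a 4x4x4 grid.
reference = np.zeros((4, 4, 4), dtype=bool)
reference[0:2, 0:2, 0:2] = True  # 8 voxels
predicted = np.zeros((4, 4, 4), dtype=bool)
predicted[0:2, 0:2, 1:3] = True  # 8 voxels, 4 of them shared with the reference

dice = dice_coefficient(predicted, reference)   # 2 * 4 / (8 + 8)
jaccard = jaccard_index(predicted, reference)   # 4 / 12
print(f"DSC = {dice:.4f}, Jaccard = {jaccard:.4f}")
```

For a multi-structure segmentation, one plausible use is to call these functions once per label (for example on `prediction == label` and `ground_truth == label`) and average the per-structure scores.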

Results

We trained the U-Net model on the combined dataset of 85 patients, with training configured for 1000 epochs. Segmentation performance varied with the applied methodology, which included data augmentation and Optuna hyperparameter tuning. The model achieved a mean Dice coefficient of 0.8330 and a Jaccard Index of 0.7356. In contrast, incorporating Attention Gates into the model resulted in a lower mean Dice coefficient of 0.7774. Qualitative results (Figure 1) demonstrate the segmentation capabilities of models trained on the combined dataset when tested on unseen local and open-source patient data. A summary of these key performance metrics is presented in Table 1.
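The reported Dice and Jaccard values can be cross-checked against each other. For a single pair of masks the two metrics are linked by J = D / (2 - D). Because this function is convex in D, a mean Jaccard computed over several cases can be no smaller than the value obtained by applying the identity to the mean Dice. The sketch below performs that check. It assumes the reported figures are averages of per-case scores, which the abstract does not state, and the function and variable names are illustrative.

```python
def jaccard_from_dice(dice):
    """Per-case identity J = D / (2 - D), valid for one pair of masks."""
    return dice / (2.0 - dice)


reported_mean_dice = 0.8330     # value reported in the Results
reported_mean_jaccard = 0.7356  # value reported in the Results

# Lower bound on the mean Jaccard implied by the mean Dice (Jensen's inequality).
implied_lower_bound = jaccard_from_dice(reported_mean_dice)

# Under the per-case averaging assumption, the reported mean Jaccard
# should be at least as large as this bound.
is_consistent = reported_mean_jaccard >= implied_lower_bound
print(f"implied lower bound = {implied_lower_bound:.4f}, consistent = {is_consistent}")
```

Under that assumption the bound is roughly 0.714, so the reported Jaccard Index of 0.7356 is compatible with the reported mean Dice coefficient.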

Conclusion(s)

On a combined, heterogeneous dataset, a standard 3D U-Net architecture without augmentation yielded the most consistent segmentation performance. Pre-processing techniques such as contrast-limited adaptive histogram equalization (CLAHE) remain valuable complementary methods for enhancing data quality.

Figure 1: Qualitative segmentation results of the …

Table 1: Consolidated Performance Metrics Across …

Contributors

S Piskin

Author

Istinye University, Istanbul, Turkiye

K B Kose

Author

I Yilmaz

Author

R Mirza

Author

F Aktay

Author

A Boray

Author

Z Dulli

Author

I Faress

Author
