A post processing pipeline to prepare raw data for machine learning algorithms in cardiac magnetic resonance imaging
European Heart Journal - Cardiovascular Imaging

Abstract
Type of funding sources: None.
Artificial Intelligence is an emerging tool in clinical practice for the post-processing of medical images. Machine Learning (ML) pipelines are built to extract the data of interest and to apply algorithms to them. A common issue in data extraction is noisy datasets, such as those of cardiac magnetic resonance (CMR) studies, which comprise multiple images acquired with different techniques, axis orientations and contrast timings.
An ML pipeline for the extraction of late gadolinium enhancement (LGE) images from raw DICOM data is presented. Additionally, steps for normalizing the number of images and automatically localizing the heart are outlined.
642 consecutive CMR studies were analyzed.
Pipeline, Part 1. Using the metadata in the raw files, the ‘SequenceName’ tag was used to discard cine images, the ‘ScanningSequence’ tag to select Gradient Recalled and Inversion Recovery techniques (Inversion Time > 100 ms), and the ‘SequenceVariant’ tag to discard Steady State images (See
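The metadata filter of Part 1 can be sketched as a single predicate over the DICOM header fields. This is a minimal illustration, not the authors' implementation. The function name `is_lge_candidate`, the cine marker string `"CINE"`, the "either GR or IR" reading of the ScanningSequence rule, and the rejection of files without an Inversion Time are all assumptions made here; only the tag names and the 100 ms threshold come from the text.

```python
from typing import Any, Iterable


def _as_set(value: Any) -> set:
    """Normalize a DICOM value (None, string, or multi-valued) to upper-case codes."""
    if value is None:
        return set()
    if isinstance(value, str):
        # Multi-valued attributes are backslash-separated in raw DICOM strings.
        return {v.strip().upper() for v in value.split("\\") if v.strip()}
    return {str(v).strip().upper() for v in value}


def is_lge_candidate(meta, cine_markers: Iterable[str] = ("CINE",),
                     min_ti_ms: float = 100.0) -> bool:
    """Return True if the header looks like an LGE image under the stated assumptions.

    `meta` is any object with a dict-like .get(keyword, default) method.
    `cine_markers` is a hypothetical, site-specific list of substrings.
    """
    # 1. SequenceName: discard cine images (marker strings are an assumption).
    name = str(meta.get("SequenceName", "") or "").upper()
    if any(marker.upper() in name for marker in cine_markers):
        return False

    # 2. ScanningSequence: keep Gradient Recalled (GR) or Inversion Recovery (IR).
    if not _as_set(meta.get("ScanningSequence", None)) & {"GR", "IR"}:
        return False

    # 3. InversionTime must exceed the threshold; missing values are rejected (assumption).
    try:
        inversion_time = float(meta.get("InversionTime", None))
    except (TypeError, ValueError):
        return False
    if inversion_time <= min_ti_ms:
        return False

    # 4. SequenceVariant: discard Steady State (SS) images.
    return "SS" not in _as_set(meta.get("SequenceVariant", None))
```

A plain dictionary is enough to try the predicate. A header read with `pydicom.dcmread(path, stop_before_pixels=True)` should work as well, since it also exposes keyword access through `.get`.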
Pipeline, Part 2. Given a desired final number of slices and resolution, the 3D array was resampled through spline interpolation. To focus on the heart, a Region of Interest (ROI) extractor was implemented, based on a YOLO network for object detection. The network was applied to all the slices (
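The resampling and cropping steps of Part 2 can be illustrated with NumPy and SciPy. This is a sketch under stated assumptions: cubic splines (`order=3`) are assumed because the text does not give the spline order, the helper names `crop_roi` and `resample_volume` are hypothetical, and the bounding box is taken as given in pixel coordinates. The YOLO detector and the way its per-slice detections are combined are not reproduced here.

```python
import numpy as np
from scipy.ndimage import zoom


def crop_roi(volume: np.ndarray, box: tuple) -> np.ndarray:
    """Crop every slice of a (slices, rows, cols) volume to one bounding box.

    `box` is (x0, y0, x1, y1) in pixels, assumed to come from an object detector.
    """
    x0, y0, x1, y1 = box
    # Clip the box to the image bounds so a slightly oversized box cannot fail.
    y0, y1 = max(0, y0), min(volume.shape[1], y1)
    x0, x1 = max(0, x0), min(volume.shape[2], x1)
    return volume[:, y0:y1, x0:x1]


def resample_volume(volume: np.ndarray, target_shape: tuple, order: int = 3) -> np.ndarray:
    """Resample a 3D array to `target_shape` with spline interpolation.

    order=3 (cubic) is an assumption; the source only says "spline interpolation".
    """
    factors = [t / s for t, s in zip(target_shape, volume.shape)]
    return zoom(volume.astype(np.float32), factors, order=order)


# Example: 7 slices of 200 x 180 pixels, cropped and resampled to 10 x 128 x 128.
volume = np.random.default_rng(0).random((7, 200, 180))
roi = crop_roi(volume, (40, 60, 160, 170))
normalized = resample_volume(roi, (10, 128, 128))
```

Interpolating along the slice axis as well as in-plane is what lets studies with different slice counts end up with the same array shape.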
At the end of the ML pipeline, images can be reduced to a common resolution and forwarded to ML algorithms.
With this pipeline, a large amount of information not needed for LGE analysis can be discarded, significantly reducing data storage requirements. In our series, the original dataset occupied about 200 GB; by requesting 10 slices per subject at a resolution of 128 × 128 pixels (and also extracting the heart ROI), the final size was reduced to 108 MB.
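The order of magnitude of the reported size can be checked with simple arithmetic. The sketch below assumes that each of the 642 studies contributes one 10-slice stack and that pixels are stored as 8-bit values; neither detail is stated in the abstract, so the result is only a plausibility check.

```python
def raw_pixel_bytes(n_subjects: int, n_slices: int, height: int, width: int,
                    bytes_per_pixel: int) -> int:
    """Uncompressed size in bytes of the pixel data alone (no metadata)."""
    return n_subjects * n_slices * height * width * bytes_per_pixel


# Assumptions: all 642 studies retained, 8-bit pixels (1 byte each).
pixel_bytes = raw_pixel_bytes(642, 10, 128, 128, 1)
pixel_megabytes = pixel_bytes / 1e6

# Reduction factor implied by the figures reported in the text (200 GB to 108 MB).
reduction_factor = 200e9 / 108e6

print(f"pixel data: {pixel_megabytes:.1f} MB, reduction: about {reduction_factor:.0f}x")
```

Under these assumptions the pixel data alone comes to roughly 105 MB, which is compatible with the reported 108 MB once stored metadata is taken into account, and the reported figures correspond to a reduction of about three orders of magnitude.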
In this work, we presented a post-processing pipeline for CMR images and LGE analysis. Given raw CMR data as input, the pipeline is able to (i) remove unneeded data, (ii) extract heart ROIs while also storing DICOM metadata, (iii) normalize the number of slices and the image resolution, and (iv) store the processed CMR data in a format ready for ML techniques.
Figures: Pipeline Schematic Representation; YOLO Heart Extraction


