Explainable visual transformer based scoring of CAD-RADS from coronary CT angiography multiplanar projections

European Heart Journal - Digital Health

12 January 2026

Abstract

Introduction

Accurate evaluation of the severity and extent of coronary artery disease (CAD) is critical for managing cardiovascular risk. Coronary CT angiography (CCTA) is the first-line non-invasive modality for this purpose. Yet interpreting scans and assigning standardized CAD-RADS scores [1] remains a resource-intensive and operator-dependent task. This creates demand for automated and transparent strategies to support routine screening.

Purpose

Building on a MaxViT AI model we previously developed for this purpose [2], we present a novel web-based platform for CAD-RADS scoring from CCTA that requires no annotations beyond the final predicted risk label. The system is designed to reflect real-world clinical decision-making, offering not only a categorical risk prediction but also intuitive visual and textual explanations to build user confidence and foster adoption.

Methods

We employ 2D representations of the left anterior descending (LAD), left circumflex (LCX), and right coronary (RCA) arteries extracted from standard CCTA. The LAD, LCX, and RCA views are combined into tri-channel composite images; when one of the three views is unavailable (e.g. because that vessel is fully healthy), it is imputed generatively. A Multi-Axis Vision Transformer (MaxViT) was fine-tuned to perform two tasks: (1) stratify patients into low (CAD-RADS 0), intermediate (1-3), or high (4+) CAD risk bins; and (2) identify cases warranting further invasive investigation (e.g. ICA). Additional outputs include heatmaps highlighting the salient image regions driving each prediction, computed with DeepSHAP [3], and automatically generated narrative summaries of the findings, produced through similarity-based retrieval-augmented generation (RAG) over radiologist-written CCTA reports.
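The tri-channel composition step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper name `compose_tri_channel`, the channel order (LAD, LCX, RCA), and the mean-based fallback are assumptions; the paper itself uses generative imputation for a missing vessel view, which is substituted here by a trivial placeholder.

```python
import numpy as np

def compose_tri_channel(lad, lcx, rca):
    """Stack per-vessel 2D MPR views into one 3-channel image (H x W x 3).

    Hypothetical helper; channel order (LAD, LCX, RCA) is an assumption.
    A missing view (None) is imputed -- here by the pixel-wise mean of the
    available channels, a stand-in for the paper's generative imputation.
    """
    views = [lad, lcx, rca]
    present = [v for v in views if v is not None]
    if not present:
        raise ValueError("at least one vessel view is required")
    # Trivial imputation placeholder for a missing (e.g. fully healthy) vessel.
    fallback = np.mean(present, axis=0)
    channels = [v if v is not None else fallback for v in views]
    return np.stack(channels, axis=-1)
```

The resulting composite can then be fed to any RGB-pretrained backbone (such as MaxViT) without architectural changes, which is the practical motivation for packing the three vessels into three channels.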

Results

The model was trained and tested on a cohort of 253 patients. In the three-class CAD risk prediction task it achieved an AUC of 0.93 [95% CI: 0.84-0.99] and a classification accuracy of 0.88 [0.79-0.95]. For the binary decision task (is follow-up invasive investigation needed?) it achieved an AUC of 0.87 [0.76-0.95]. Comparative benchmarks showed superior performance over conventional CNN and attention-based baselines. The trained model is computationally lightweight and can be deployed on the modest hardware typically available to radiology staff. A working prototype with an interactive dashboard was developed (see Figure 1) and tested, showing good usability for non-technical clinical users.

Conclusion

We present an automated, annotation-light CAD-RADS classification tool that mirrors physician workflows using standard CCTA imaging. By combining accuracy with visual and textual explainability, it addresses key limitations of current CAD screening: time burden, inter-rater variability, and lack of transparency. The prototype shows good potential for integration into population-level screening programs without requiring computational resources that remain scarce in healthcare organizations.

Contributors

E Parimbelli

Author

University of Pavia, Pavia, Italy

M Chiesa

Author

Monzino Cardiology Centre, Milan, Italy

G Albi

Author
