Leveraging structural equation modeling and machine learning on large-scale metabolomic data to uncover atrial fibrillation risks

European Heart Journal - Digital Health

12 January 2026
Organised by: ESC Journals

Abstract

Large-scale biomedical repositories present unique challenges for both machine learning (ML) and structural equation modelling (SEM) when identifying subtle risk factors and their complex interplay. In the context of fatty acid research, polyunsaturated fatty acids (PUFAs) have shown cardiovascular risk reduction in some trials but were linked to increased atrial fibrillation (AF) in others (e.g., REDUCE-IT). In contrast, STRENGTH suggested no significant association between Omega-3 levels and AF. We aimed to clarify these findings using both supervised ML and advanced SEM techniques.

We included 483,372 participants from the UK Biobank with nuclear magnetic resonance (NMR)-based metabolomic profiles, focusing on fatty acid metabolism and lipidomic markers. Tree-based ML in conjunction with explainable AI techniques, such as SHAP (SHapley Additive exPlanations), was employed to identify potential contributors to incident AF. Subsequent SEM integrated multiple observed variables—such as left atrial volume, AF incidence, hypertension status, and Omega-3 supplement intake—enabling a comprehensive representation of complex interrelations. SEM, in this context, extends conventional regression by simultaneously estimating multiple dependent relationships, helping to clarify mediation pathways among clinical and metabolic factors.
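The idea of estimating several dependent relationships at once can be illustrated with a minimal path-analysis sketch. It is not the authors' model and uses no UK Biobank data: the variables `x` (supplement use), `m` (Omega-3 level) and `y` (an outcome score) are hypothetical, the data are synthetic and noise-free, and the equation-by-equation least-squares fit is a simplification of full SEM estimation, which typically fits all equations jointly.

```python
# Toy mediation/path-analysis sketch on SYNTHETIC data (illustrative assumption only).
# Hypothetical variables: x = supplement use (0/1), m = Omega-3 level, y = outcome score.

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def path_coefficients(x, m, y):
    """Estimate the paths x -> m (a), m -> y (b) and x -> y (direct)."""
    # Equation 1: m = a * x + error
    a = cov(x, m) / cov(x, x)
    # Equation 2: y = direct * x + b * m + error (two-predictor least squares)
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    direct = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det
    # The indirect (mediated) effect is the product of the two paths.
    return {"a": a, "b": b, "direct": direct, "indirect": a * b}

# Synthetic design: m depends on x and on an unrelated factor u; y depends on m only.
x = [0, 0, 1, 1, 0, 0, 1, 1]
u = [0, 1, 0, 1, 0, 1, 0, 1]
m = [0.5 * xi + ui for xi, ui in zip(x, u)]
y = [0.3 * mi for mi in m]

paths = path_coefficients(x, m, y)
print(paths)
```

In this toy design the exposure changes the mediator, yet its direct path to the outcome is zero, which is the kind of distinction between direct and mediated effects that a path model makes explicit.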

Explainable AI techniques identified key contributors to AF, most notably low cholesteryl esters relative to total lipids in intermediate-density and low-density lipoproteins, diminished cholesterol-to-total-lipid ratios in intermediate-density lipoproteins, and lower linoleic and Omega-6 fatty acids. Omega-3 supplement use, although correlated with increased Omega-3 levels, did not substantially modify AF risk. SEM revealed no significant direct association between baseline Omega-3 levels and AF (coefficient: –0.0073, 95% CI –0.0979 to 0.0833, p=0.875). In contrast, Omega-6 levels displayed a moderate, protective association with AF (coefficient: –0.1698, 95% CI –0.2650 to –0.0746, p<0.001). Age (coefficient: 0.0783, 95% CI 0.0667 to 0.0899, p<0.001) and left atrial volume (coefficient: 0.6783, 95% CI 0.6269 to 0.7297, p<0.001) were strong risk factors for AF.
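As a reading aid, the reported coefficients, confidence intervals and p-values can be cross-checked against one another. The sketch below assumes the intervals are symmetric Wald intervals based on a normal approximation, which the abstract does not state; the function name `wald_from_ci` is hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

def wald_from_ci(coef, lower, upper):
    """Recover standard error, z statistic and two-sided p from a 95% CI.

    Assumes a symmetric Wald interval: coef +/- Z_975 * SE.
    """
    se = (upper - lower) / (2 * Z_975)
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Coefficients and 95% CIs as reported in the results above.
reported = {
    "Omega-3": (-0.0073, -0.0979, 0.0833),
    "Omega-6": (-0.1698, -0.2650, -0.0746),
    "Age": (0.0783, 0.0667, 0.0899),
    "Left atrial volume": (0.6783, 0.6269, 0.7297),
}

results = {name: wald_from_ci(*values) for name, values in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: SE={se:.4f}, z={z:.2f}, p={p:.3g}")
```

Under this assumption the Omega-3 interval implies a p-value close to the reported 0.875, and the other three intervals imply p-values below 0.001, so the reported figures are internally consistent.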

Our results emphasise the feasibility of combining machine learning, explainable AI methods, and SEM in large-scale cohorts. Consistent with STRENGTH, we found no significant direct link between Omega-3 levels and AF. However, lower Omega-6 levels were associated with increased AF risk, suggesting a potentially protective role for certain unsaturated fatty acids. By integrating supervised ML with SEM, researchers can reveal complex, multivariate relationships in large study cohorts, facilitating refined risk stratification. This framework is readily applicable to other cardiovascular conditions and broader disease domains, supporting diverse biomedical research questions.

Contributors

J Versnjak

Author

Charite Campus Virchow Clinic, Berlin, Germany

B Wild

Author

T Kuehne

Author

Charité - University Medicine Berlin, Berlin, Germany

U Kintscher

Author

Charite University Hospital, Berlin, Germany

M Kelm

Author

Charité - University Medicine Berlin, Berlin, Germany
