Phenogrouping HFpEF trajectories identifies early and end-stage subtypes: a multicentre study using natural language processing

European Heart Journal

5 November 2025
Organised by: ESC Journals

Abstract

Background

Heart failure with preserved ejection fraction (HFpEF) is a heterogeneous and frequently underdiagnosed syndrome, complicating timely intervention. Previous clustering studies were predominantly trial-based, limiting real-world applicability. Characterising HFpEF phenogroups in routine clinical practice could improve diagnosis and support targeted therapies.

Purpose

This study aimed to characterise HFpEF phenogroups using cluster analysis of real-world electronic health record (EHR) data and investigate their clinical trajectories and outcomes.

Methods

We conducted a retrospective cohort study using routinely collected EHR data from two UK centres. HFpEF patients were identified per European Society of Cardiology (ESC) criteria using a validated natural language processing pipeline. Unsupervised clustering was performed using latent class analysis (LCA) applied to ten clinical features. External validation was conducted in an independent cohort. Longitudinal transitions between phenogroups and all-cause mortality were analysed.
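The latent class step can be illustrated with a minimal sketch. It assumes each clinical feature has been coded as a binary indicator and fits the classes with a plain expectation-maximisation loop. The abstract does not state the software, feature encoding, or fitting settings, so the function name `fit_lca`, its parameters, and the toy data are hypothetical and are not the authors' pipeline.

```python
import math
import random


def fit_lca(data, n_classes, n_iter=200, seed=0):
    """Fit a latent class model to binary indicators with EM (illustrative only).

    data: list of rows, each a list of 0/1 values (assumed encoding).
    Returns (class_weights, item_probs, responsibilities, log_likelihoods);
    the responsibilities come from the final E-step.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting item probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    history = []
    resp = []
    for _ in range(n_iter):
        # E-step: posterior class membership for every row.
        resp = []
        log_lik = 0.0
        for row in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for x, q in zip(row, probs[k]):
                    p *= q if x else (1.0 - q)
                joint.append(p)
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([j / total for j in joint])
        history.append(log_lik)
        # M-step: re-estimate class sizes and item probabilities.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            for j in range(n_items):
                hits = sum(r[k] * row[j] for r, row in zip(resp, data))
                # Light smoothing keeps probabilities strictly inside (0, 1).
                probs[k][j] = (hits + 1e-6) / (nk + 2e-6)
    return weights, probs, resp, history


if __name__ == "__main__":
    # Toy data: two clearly different indicator patterns (hypothetical).
    toy = [[1, 1, 1, 0, 0, 0]] * 20 + [[0, 0, 0, 1, 1, 1]] * 20
    class_weights, item_probs, _, _ = fit_lca(toy, n_classes=2)
    print(class_weights)
    print(item_probs)
```

In practice the number of classes would be chosen by comparing fit criteria across candidate models, and continuous features such as age or NT-proBNP would need either categorisation or a mixture model with suitable distributions; neither choice is specified in the abstract.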

Results

Among 2,223 patients (median age 75 years, 60% female), 89.6% met ESC criteria but had no clinician-assigned diagnosis. LCA identified four phenogroups: (1) Elderly-Atrial Dysfunction (oldest, atrial fibrillation, significant diastolic dysfunction) [N=703 (32%)]; (2) Cardio-Renal-Metabolic (high burden of diabetes, kidney disease, and cardiac remodelling) [N=530 (24%)]; (3) Obesity-Predominant (younger patients with significant obesity and lower N-terminal pro-B-type natriuretic peptide (NT-proBNP) levels) [N=503 (23%)]; and (4) Young-Low Comorbidity (minimal traditional risk factors, mild cardiac dysfunction and the lowest likelihood of clinician-assigned diagnosis) [N=487 (22%)]. External validation (N=3,349) confirmed phenogroup reproducibility. The estimated five-year mortality for the Young-Low Comorbidity phenogroup was 26%. Compared to this group, the Cardio-Renal-Metabolic and Elderly-Atrial Dysfunction phenogroups had significantly higher adjusted mortality risks (HR: 1.49; 95% CI: 1.18-1.87; P < 0.001 and HR: 1.30; 95% CI: 1.02-1.67; P < 0.03, respectively). Over a median 3.97-year follow-up, 53% of Young-Low Comorbidity patients progressed to higher-risk phenogroups.
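As a rough consistency check on the reported hazard ratios, a two-sided Wald P value can be approximated from a hazard ratio and its 95% confidence interval. The sketch below assumes the interval is symmetric on the log scale and based on a normal approximation; the function name `wald_p_from_ci` is hypothetical, and the result is only an approximation of whatever test the authors actually used.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate the Wald z statistic and two-sided P value from an HR and 95% CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    # Standard error of log(HR), recovered from the CI width on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hazard ratios and intervals as reported in the Results.
for label, hr, lo, hi in [
    ("Cardio-Renal-Metabolic", 1.49, 1.18, 1.87),
    ("Elderly-Atrial Dysfunction", 1.30, 1.02, 1.67),
]:
    z, p = wald_p_from_ci(hr, lo, hi)
    print(f"{label}: z = {z:.2f}, approximate P = {p:.4f}")
```

Because the published bounds are rounded to two decimals, the recovered P values are indicative only, but they show which of the two comparisons carries the stronger evidence.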

Conclusions

AI-driven phenotyping identified four clinically distinct and prognostically relevant HFpEF phenogroups. Our findings highlight the high underdiagnosis rate and rapid progression in early-stage HFpEF, reinforcing the need for improved recognition and early intervention. Targeted therapies may hold particular promise for this group, warranting further investigation.

Phenogroup trajectories

Contributors

S Brown

Author

King's College Hospital NHS Foundation Trust, London, United Kingdom of Great Britain & Northern Ireland

J Wu

Author

King's College London, London, United Kingdom of Great Britain & Northern Ireland

M Rizvi

Author

T Searle

Author

D Bromage

Author

King's College London, London, United Kingdom of Great Britain & Northern Ireland

A M Shah

Author

King's College London, London, United Kingdom of Great Britain & Northern Ireland

C Miller

Author

University of Manchester, Manchester, United Kingdom of Great Britain & Northern Ireland

D Biswas

Author
