Polypharmacy and multimorbidity metrics for predicting ischaemic stroke using machine learning: The Hong Kong Diabetes Study
EP Europace Journal

Abstract
Diabetes mellitus is associated with a two- to five-fold increase in the risk of ischaemic stroke. This study aimed to identify individuals at risk of stroke among the diabetic population using routinely collected health records from Hong Kong, China.
This retrospective cohort study focused on patients with type 2 diabetes mellitus (T2DM) who received care at public hospitals or their affiliated ambulatory/outpatient facilities in Hong Kong from January 1 to December 31, 2009. Patients’ demographic details, baseline comorbidities, and use of antidiabetic agents and cardiovascular medications were examined. A predictive model based on logistic regression was developed using a training cohort, and further machine learning techniques were explored. Both linear and nonlinear algorithms were considered, with stratified ten-fold cross-validation used to identify risk factors for stroke prediction in patients with T2DM. Feature selection and the Random Over-Sampling Examples (ROSE) technique were applied to enhance learning efficiency. Performance metrics were calculated to evaluate each classifier.
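The resampling scheme described above can be illustrated with a minimal sketch. It assumes plain random oversampling with replacement as a simplified stand-in for ROSE (which generates synthetic examples by smoothed bootstrap), and it uses toy labels with roughly 3% positives rather than the study data. The function names `stratified_folds` and `oversample_minority` are hypothetical, and this is not the authors' implementation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def oversample_minority(train_idx, labels, seed=0):
    """Resample smaller classes with replacement until all classes match
    the largest one (simple random oversampling, not ROSE itself)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in train_idx:
        by_class[labels[idx]].append(idx)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Toy data (assumption): 1,000 patients, 30 of whom had a stroke.
labels = [1] * 30 + [0] * 970
folds = stratified_folds(labels, k=10)

for held_out in range(10):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Balance only the training portion; the held-out fold keeps its
    # original class ratio so that evaluation is not distorted.
    train_bal = oversample_minority(train_idx, labels)
    # A classifier would be fitted on train_bal and scored on test_idx here.
```

Oversampling is applied inside each training split rather than before splitting, so that duplicated minority-class records cannot appear in both the training and the held-out fold.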
The cohort comprised 273,876 patients with T2DM (mean age: 65.4 years; 48.2% male). Among them, 8,986 (3.28%) patients experienced a stroke during follow-up. Five selected machine learning models demonstrated strong discriminatory power and predictive accuracy for identifying stroke risk. The multivariable logistic regression model was the best performer, achieving 78.2% accuracy (AUC: 85.6%, sensitivity: 79.3%, specificity: 77.0%) and 78.1% recall for baseline modeling. Model performance could be further improved to 97.2% accuracy (AUC: 98.6%, sensitivity: 95.0%, specificity: 99.4%) and 99.6% recall by refining the grouping of polypharmacy and multimorbidity. These two features, along with patients’ age, use of antidiabetic and cardiovascular medications, and comorbid hypertension, atrial fibrillation, coronary heart disease, heart failure and intracranial haemorrhage, were significant predictors of stroke risk.
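For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity and specificity follow from a binary confusion matrix. The counts are invented for illustration and are not taken from the study; `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # true positive rate among stroke cases
        "specificity": tn / (tn + fp),    # true negative rate among non-stroke cases
    }


# Illustrative counts only (assumption): 100 stroke cases, 900 non-stroke cases.
metrics = classification_metrics(tp=80, fp=180, tn=720, fn=20)
print(metrics)
```

With an outcome as rare as 3.28%, accuracy alone can be misleading, which is why sensitivity and specificity are reported alongside it.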
Patients aged 65 or above, using sulphonylurea drugs and lipid-lowering agents, and having comorbid hypertension, showed an increased stroke risk in the diabetic population. High levels of polypharmacy and multimorbidity further elevated this risk significantly within the cohort. This study provides a rapid tool for stroke risk assessment tailored to patients with T2DM using routinely collected health records. It aims to support healthcare professionals and frontline practitioners in the early detection and timely management of high-risk groups.

