JMIR Preprints #78954: Applications and methods to develop artificial intelligence-based population-specific risk models for predicting first and recurrent cardio/cerebrovascular events: PowerAI-CVD Showcase

Applications and methods to develop artificial intelligence-based population-specific risk models for predicting first and recurrent cardio/cerebrovascular events: PowerAI-CVD Showcase

Haipeng Liu;
Jeremy Man Ho Hui;
Anselm Au;
Quincy Lee;
Carlin Chang;
Mehrdad Shahmohammadi Beni;
Gary Tse

ABSTRACT

Background:

Our team was the first in Hong Kong to develop machine learning-enhanced risk models for predicting first and recurrent cardiovascular disease events in predominantly Chinese subjects, using territory-wide data from the region. Initially, more than 500 candidate risk variables were considered, spanning demographics (age, sex, source of admissions, ethnicity, number of hospitalisations prior to the index date), physiological status (systolic blood pressure [SBP], diastolic blood pressure [DBP], mean blood pressure [MBP], and the variability of SBP, DBP and MBP), disease diagnoses from 18 systems/organs, laboratory test results (complete blood count, liver and renal function, lipids, glycemic tests), and medications (23 categories). The resulting PowerAI-CVD model is simpler, using 19 variables; it requires less computational power yet retains high discriminative power, with a c-statistic of 0.89.
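For readers unfamiliar with the c-statistic, the following is a minimal sketch of one common definition for time-to-event data (Harrell's concordance index). It is an illustration only: the abstract does not state which estimator was used for PowerAI-CVD, and the function name `harrell_c`, the toy data, and the decision to skip pairs with tied follow-up times are all assumptions made here.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Concordance index for right-censored data (simplified sketch).

    times       -- follow-up time per subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject i has the shorter follow-up.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event. Tied times are skipped (a simplification).
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0   # earlier event had the higher score
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, purely illustrative (not from the study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.3, 0.7, 0.4]
toy_c = harrell_c(toy_times, toy_events, toy_scores)
print(f"c-statistic on toy data: {toy_c:.2f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.89 should be read.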

Objective:

This project gave rise to a series of graphical user interface (GUI)-based applications and tools for longitudinal analysis of routinely collected electronic health records from Hong Kong, which we termed the Open-source disease analyzer toolkit (ODAT).

Methods:

ODAT was developed in Python. It is publicly available at https://odat.info/ and released under the GNU GPLv3 on GitHub (https://github.com/ODAT-Project); it is free and open-source for research or commercial use.

Results:

ODAT contains three chapters. Chapter 1 covers data cleaning, processing and dataset creation. Chapter 2 automates data analysis and risk modelling using traditional Cox regression and machine learning methods (XGBoost, Gradient Boosting, Multilayer Perceptron, Random Forest, Naïve Bayes, Decision Tree, k-Nearest Neighbor, AdaBoost, and SVM-Sigmoid). Using the top-performing machine learning model (XGBoost) as a showcase, nonlinear terms can be fed into traditional Cox regression models to enhance risk prediction. Chapter 3 provides graphical outputs of risk estimates over 1-, 3-, 5-, 10- and 20-year periods, and interactive platforms that illustrate how the risk estimates change when treatment options are selected or deselected.
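The horizon-specific risk outputs described for Chapter 3 can be illustrated with a minimal sketch of how a Cox-type model turns a linear predictor into absolute risk, using risk(t) = 1 - S0(t)^exp(lp). This is not ODAT's code: the baseline survival values, the coefficients, the feature names, and the "on_statin" treatment toggle are all hypothetical numbers chosen for illustration, and features are assumed to be expressed as deviations from a reference patient.

```python
import math

HORIZONS_YEARS = (1, 3, 5, 10, 20)

# Hypothetical baseline survival S0(t) at each horizon (illustrative only).
BASELINE_SURVIVAL = {1: 0.99, 3: 0.97, 5: 0.95, 10: 0.90, 20: 0.80}

# Hypothetical log hazard ratios; NOT coefficients from PowerAI-CVD.
COEFFICIENTS = {
    "age_per_10y": 0.50,
    "sbp_per_10mmHg": 0.10,
    "on_statin": -0.25,
}


def linear_predictor(features):
    """Sum of coefficient * feature value (features relative to reference)."""
    return sum(COEFFICIENTS[name] * value for name, value in features.items())


def risk_by_horizon(features):
    """Absolute risk at each horizon: 1 - S0(t) ** exp(linear predictor)."""
    lp = linear_predictor(features)
    return {t: 1.0 - BASELINE_SURVIVAL[t] ** math.exp(lp) for t in HORIZONS_YEARS}


# A hypothetical patient, with and without the treatment option selected.
patient = {"age_per_10y": 1.0, "sbp_per_10mmHg": 2.0, "on_statin": 0}
untreated = risk_by_horizon(patient)
treated = risk_by_horizon({**patient, "on_statin": 1})

for t in HORIZONS_YEARS:
    print(f"{t:>2}-year risk: untreated {untreated[t]:.3f}, treated {treated[t]:.3f}")
```

Toggling a treatment flag and recomputing the risks is the simplest form of the select/deselect interaction; an interactive front end would call a function like `risk_by_horizon` each time an option changes.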

Conclusions:

Our tools enable epidemiologists, public health practitioners and researchers to develop risk models through user-friendly GUIs, from database building through variable selection to model building.

Citation

Please cite as:

Liu H, Hui JMH, Au A, Lee Q, Chang C, Beni MS, Tse G

Applications and methods to develop artificial intelligence-based population-specific risk models for predicting first and recurrent cardio/cerebrovascular events: PowerAI-CVD Showcase

JMIR Preprints. 13/06/2025:78954

DOI: 10.2196/preprints.78954

URL: https://preprints.jmir.org/preprint/78954

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: JMIR Preprints

Date Submitted: Jun 13, 2025
