Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 15, 2023
Date Accepted: Sep 17, 2024

The final, peer-reviewed published version of this preprint can be found here:

Machine Learning–Based Prediction for Incident Hypertension Based on Regular Health Checkup Data: Derivation and Validation in 2 Independent Nationwide Cohorts in South Korea and Japan

Hwang SH, Lee H, Lee JH, Lee M, Koyanagi A, Smith L, Rhee SY, Yon DK, Lee J

J Med Internet Res 2024;26:e52794

DOI: 10.2196/52794

PMID: 39499554

PMCID: 11576616

Machine learning-based prediction for incident hypertension based on regular health checkup data: derivation and validation in two independent nationwide cohorts in South Korea and Japan

  • Seung Ha Hwang
  • Hayeon Lee
  • Jun Hyuk Lee
  • Myeongcheol Lee
  • Ai Koyanagi
  • Lee Smith
  • Sang Youl Rhee
  • Dong Keon Yon
  • Jinseok Lee

ABSTRACT

Background:

Cardiovascular diseases (CVDs) are the leading cause of death globally, and hypertension is a key contributor. In 2019, CVDs caused 17.9 million deaths, a figure predicted to reach 23 million by 2030.

Objective:

This study presents a new method to predict hypertension using demographic data, employing six machine learning models for enhanced reliability and applicability. The goal is to harness AI for early and accurate hypertension diagnosis across diverse populations.

Methods:

Data from two national cohort studies were used. The NHIS-NSC cohort (South Korea, n=244,814), conducted between 2002 and 2013, was used to train and test machine learning models designed to predict incident hypertension within five years of a health checkup in individuals aged ≥20 years. The JMDC cohort (Japan, n=1,296,649) was used for extra validation. An ensemble built from six diverse machine learning models was employed, and a feature importance analysis was performed to identify the five most salient features contributing to hypertension and to confirm the contribution of each feature.
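The abstract does not state how the ensemble combines the outputs of its member models. As a rough illustration only, the sketch below assumes soft voting, that is, averaging each model's predicted probability and applying a threshold. The function name `soft_vote`, the 0.5 threshold, and the probability values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def soft_vote(model_probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-model probabilities and threshold them into 0/1 labels.

    Assumption: soft voting is used here for illustration; the source does
    not specify the ensemble's combination rule.
    """
    if not model_probs:
        raise ValueError("at least one model is required")
    n_samples = len(model_probs[0])
    if any(len(probs) != n_samples for probs in model_probs):
        raise ValueError("all models must score the same number of samples")

    labels = []
    for i in range(n_samples):
        # Mean predicted probability of incident hypertension for sample i.
        mean_prob = sum(probs[i] for probs in model_probs) / len(model_probs)
        labels.append(1 if mean_prob >= threshold else 0)
    return labels


# Made-up probabilities for three people from two hypothetical member models.
adaboost_probs = [0.9, 0.4, 0.2]
logreg_probs = [0.7, 0.7, 0.1]
print(soft_vote([adaboost_probs, logreg_probs]))
```

Averaging probabilities lets a confident model outweigh an uncertain one, which hard majority voting would not do with only two members.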

Results:

The AdaBoost and logistic regression ensemble showed superior balanced accuracy (0.812; sensitivity, 0.806; specificity, 0.818; area under the receiver operating characteristic curve [AUROC], 0.901). The five key hypertension indicators were age, diastolic blood pressure, body mass index, systolic blood pressure, and fasting blood glucose. The JMDC dataset (extra-validation set) corroborated these findings (balanced accuracy, 0.741; AUROC, 0.824). The ensemble model was integrated into a public web portal (http://ai-wm.khu.ac.kr/Hypertension/) for predicting hypertension onset based on health checkup data.
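Balanced accuracy is conventionally defined as the mean of sensitivity and specificity. The short check below, a sketch using that standard definition, shows that the reported NHIS-NSC values are internally consistent. The helper names are illustrative and do not come from the study's code.

```python
def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy as the arithmetic mean of the two rates."""
    return (sensitivity + specificity) / 2.0


# Values reported in the abstract for the AdaBoost and logistic regression ensemble.
reported = balanced_accuracy(sensitivity=0.806, specificity=0.818)
print(round(reported, 3))  # 0.812, matching the reported balanced accuracy
```

Because the metric weights both classes equally, it is less affected by class imbalance than plain accuracy, which matters when incident cases are a minority of the cohort.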

Conclusions:

Comparative evaluation of our machine learning models against classical statistical models across two distinct studies emphasized the former's enhanced stability, generalizability, and reproducibility in predicting hypertension onset.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.