Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 19, 2019
Date Accepted: Mar 5, 2020
Predicting Metabolic Syndrome: Machine Learning Models Using Decision Trees Algorithm
ABSTRACT
Background:
Metabolic syndrome (MetS) is a cluster of disorders that significantly influence the development and deterioration of numerous diseases. Although several clinically significant factors leading to MetS are known, the rank order of their relevance and importance is unclear. FibroScan uses ultrasound-based elastography to provide clinicians with a controlled attenuation parameter (CAP) score, which helps identify patients suspected of having fatty liver. The use of artificial intelligence in health care, particularly machine learning methods, provides an opportunity to discover underlying patterns and correlations through the learning of data-driven prediction models.
Objective:
The aim of this study is to apply machine learning models to a large set of electronic health records combined with the CAP score measured by FibroScan, a noninvasive, safe, and rapid device that assesses the hardness of the liver using ultrasound-based elastography. Its portability makes it valuable for bedside inspection in hospitalized patients and population outreach screening. We use various statistical learning techniques to visualize and investigate the ranking and importance of risk factors leading to MetS and to identify potential variables.
Methods:
Hypothesis testing and multivariable logistic regression were conducted for every risk factor of MetS. Principal component analysis was used to visualize the distribution of MetS patients after rotation and dimension reduction. Because machine learning methods offer an opportunity to discover underlying patterns and correlations through data-driven prediction models, we applied various statistical learning techniques to visualize and investigate the patterns and relationships between MetS and several potential variables.
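To make the "rotation and dimension reduction" step concrete, the following is a minimal sketch of principal component analysis in plain Python. It is not the authors' implementation: the helper names (`center`, `covariance`, `leading_component`, `project`) and the toy data are hypothetical, and power iteration is used here only as a simple stand-in for a full eigendecomposition.

```python
import math

def center(rows):
    """Subtract each column's mean so the data cloud is centred at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of already-centred rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iters=200):
    """Approximate the first principal axis (unit vector) by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, axis):
    """Coordinate of each row along the given unit axis (the reduced dimension)."""
    return [sum(a * b for a, b in zip(r, axis)) for r in centered]

# Hypothetical toy data (not study data): the second feature is twice the first,
# so all of the variance lies along a single direction.
toy_rows = [[float(x), 2.0 * x] for x in range(10)]
toy_centered = center(toy_rows)
pc1 = leading_component(covariance(toy_centered))
pc1_scores = project(toy_centered, pc1)
```

In this toy case the two features collapse onto one axis, so `pc1_scores` carries the same information as the original two columns. With real clinical variables measured on different scales, the features would normally be standardized before this step.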
Results:
A total of 1,333 relatively healthy participants were enrolled in this study. Obesity, serum glutamic-oxaloacetic transaminase, serum glutamic pyruvic transaminase (γ-GT), CAP score, and glycated hemoglobin (HbA1c) were found to be significant risk factors in multivariable logistic regression. Among these significant variables, the CAP score was as important as obesity in the classification of MetS, with approximately 290–300 dB/m as a threshold, implying that FibroScan can provide a convenient and rapid test for MetS diagnosis in patients, even though the disease is complicated and progressive. In addition, HbA1c was more reliable and precise for predicting MetS than fasting plasma glucose, with approximately 5.9 as a threshold in decision trees. Liver-related indices, such as γ-GT, serum glutamic pyruvic transaminase, and liver stiffness score (E score), were also considered important variables in random forest models. The area under the receiver operating characteristic curve (AUC) was 0.831 for classification and regression trees (CART) and 0.904 for random forest. Machine learning models combined the CAP score with other parameters to analyze important determinants of MetS and then established a prediction model, suggesting a more objective and accurate performance than is possible with traditional analytic models.
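As an illustration of how a decision tree arrives at a single-variable threshold and how an AUC is computed, here is a minimal sketch in plain Python. It assumes a Gini-impurity split criterion, as used by CART-style trees, and the rank-based definition of AUC. The function names and the toy values are hypothetical and unrelated to the study data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_threshold(values, labels):
    """Find the split 'value <= threshold' with the lowest weighted Gini impurity.

    Returns (threshold, weighted_impurity). Candidate thresholds are midpoints
    between consecutive distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no valid split between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2.0, score)
    return best

def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher than
    a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data (not study data): one made-up marker and a 0/1 outcome.
toy_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

toy_threshold, toy_impurity = best_threshold(toy_values, toy_labels)
toy_auc = auc(toy_values, toy_labels)
```

A full tree repeats this search over every candidate variable at each node, and a random forest averages many such trees fitted on resampled data, which is one reason its AUC can exceed that of a single tree.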
Conclusions:
Machine learning technology facilitates the identification of prevalent risk factors for MetS, enabling the rate of MetS to be further reduced.
Clinical Trial: TMU-Joint Institutional Review Board TMU-JIRB No.: N201903080 http://ohr.tmu.edu.tw/2dt2/super_pages.php?ID=2page202
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.