JMIR Preprints #15431: Ensemble learning models based on non-invasive features for type 2 diabetes screening: a case-control study


Ensemble learning models based on non-invasive features for type 2 diabetes screening: a case-control study

Tianzhou Yang;
Li Zhang;
Liwei Yi;
Huawei Feng;
Shimeng Li;
Haoyu Chen;
Junfeng Zhu;
Jian Zhao;
Yingyue Zeng;
Hongsheng Liu

ABSTRACT

Background:

Early diabetes screening can effectively reduce the burden of disease. However, natural population-based screening projects require substantial resources.

Objective:

This study aimed to build diabetes prediction models based on the ensemble learning method so that screening can be carried out in a non-invasive and low-cost manner.

Methods:

The dataset for building and evaluating the diabetes prediction models was extracted from the National Health and Nutrition Examination Survey (NHANES 2011-2016). After data cleaning and feature selection, the 2011-2014 data were split into a training set (80%) and a test set (20%), and the 2015-2016 data were held out as a validation set. Three simple machine learning methods (linear discriminant analysis, support vector machine, and random forest) and the easy ensemble method were used to build the models. Model performance was evaluated through 5-fold cross-validation and external validation. DeLong's test (two-sided) was used to test the performance differences between the models.
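As a rough illustration of the easy ensemble idea named in the methods, the sketch below draws several class-balanced subsets by undersampling the majority class, fits one base model per subset, and averages their scores. It is not the authors' implementation: the function names, the number of subsets, and the toy one-feature base learner `fit_midpoint` are all assumptions made for illustration, and a real study would plug in a proper classifier such as linear discriminant analysis.

```python
import random
from statistics import mean


def balanced_subsets(labels, n_subsets, seed=0):
    """Yield index lists holding every minority case plus an equal-sized
    random draw (without replacement) from the majority class."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    for _ in range(n_subsets):
        yield sorted(minority + rng.sample(majority, len(minority)))


def fit_easy_ensemble(X, y, fit_base, n_subsets=10, seed=0):
    """Train one base model per balanced subset and return a scorer that
    averages the base models' scores for a single row.

    fit_base(X_sub, y_sub) must return a function mapping a row to a score.
    """
    models = []
    for idx in balanced_subsets(y, n_subsets, seed):
        models.append(fit_base([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: mean(m(row) for m in models)


def fit_midpoint(X_sub, y_sub):
    """Hypothetical toy base learner: threshold the first feature at the
    midpoint between the two class means and return a 0/1 vote."""
    pos_mean = mean(x[0] for x, y in zip(X_sub, y_sub) if y == 1)
    neg_mean = mean(x[0] for x, y in zip(X_sub, y_sub) if y == 0)
    cut = (pos_mean + neg_mean) / 2
    return lambda row: 1.0 if (row[0] > cut) == (pos_mean > neg_mean) else 0.0


# Toy data (made up): 3 cases and 9 controls described by one feature.
X = [[30], [32], [34]] + [[v] for v in range(20, 29)]
y = [1, 1, 1] + [0] * 9
score = fit_easy_ensemble(X, y, fit_midpoint, n_subsets=5, seed=1)
```

Because each subset keeps all minority cases, every base model sees a balanced sample while the ensemble as a whole still uses much of the majority class.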

Results:

A total of 8057 observations and 12 attributes were selected from the database. In the 5-fold cross-validation, the three simple methods yielded models with areas under the curve (AUCs) over 0.800, and the ensemble models significantly outperformed the simple models. The same trends were observed in the test set and the validation set. The ensemble model of linear discriminant analysis performed best, with an AUC of 0.849, an accuracy of 0.730, a sensitivity of 0.819, and a specificity of 0.709 in the validation set.
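For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, sensitivity, specificity, and AUC can be computed from true labels and predicted scores. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration only; the abstract does not state the threshold used in the study.

```python
def screening_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold
    (label 1 = case, label 0 = control)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of cases flagged
        "specificity": tn / (tn + fp),  # share of controls cleared
    }


def auc(y_true, y_score):
    """AUC as the probability that a random case scores higher than a
    random control, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up labels and scores, not study data).
y_true = [1, 1, 0, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.2, 0.1]
metrics = screening_metrics(y_true, y_score)
area = auc(y_true, y_score)
```

AUC does not depend on the threshold, whereas sensitivity and specificity trade off against each other as the threshold moves.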

Conclusions:

The study indicates that machine learning models built on non-invasive tests could enable efficient screening of a large population and support the objective of secondary prevention.

Citation

Please cite as:

Yang T, Zhang L, Yi L, Feng H, Li S, Chen H, Zhu J, Zhao J, Zeng Y, Liu H

Ensemble Learning Models Based on Noninvasive Features for Type 2 Diabetes Screening: Model Development and Validation

JMIR Med Inform 2020;8(6):e15431

DOI: 10.2196/15431

PMID: 32554386

PMCID: 7333074


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 10, 2019

Date Accepted: Feb 7, 2020
