JMIR Preprints #64979: Interpretable Machine Learning Model for Predicting and Risk Assessment of Diabetic Nephropathy


Interpretable Machine Learning Model for Predicting and Risk Assessment of Diabetic Nephropathy

Yili Wen;
Zhiqiang Wan;
Huiling Ren;
Xu Wang;
Weijie Wang

ABSTRACT

Introduction: Diabetic Nephropathy (DN), a severe complication of diabetes, is characterized by proteinuria, hypertension, and progressive renal function decline, potentially leading to end-stage renal disease. The International Diabetes Federation projects that by 2045, 783 million people will have diabetes, with 30%-40% of them developing DN. Current diagnostic approaches lack sufficient sensitivity and specificity for early detection and diagnosis, underscoring the need for an accurate, interpretable predictive model to enable timely intervention, reduce cardiovascular risks, and optimize healthcare costs.

Methods:

This retrospective cohort study analyzed 1,000 patients with type 2 diabetes using electronic medical record data collected between 2015 and 2020. The sample comprised 444 patients with diabetic nephropathy and 556 without; the variables covered demographics, clinical metrics such as blood pressure and glucose levels, and renal function markers. Missing values were handled via multiple imputation, and class balance was achieved using the synthetic minority oversampling technique (SMOTE). Three gradient-boosting algorithms, namely XGBoost, CatBoost, and LightGBM, were used because of their robustness in handling complex datasets. Model performance was assessed using accuracy, precision, recall, F1 score, specificity, and area under the curve (AUC). Additionally, Explainable Machine Learning (XML) techniques, such as LIME and SHAP, were applied to make the models' decision-making processes more transparent and interpretable.
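The threshold-based metrics named in the Methods can all be derived from the four cells of a binary confusion matrix. The following is a minimal Python sketch of those definitions, not the authors' evaluation code: the `ConfusionCounts` type, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration. AUC is omitted because it requires predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # DN patients predicted as DN
    fp: int  # non-DN patients predicted as DN
    tn: int  # non-DN patients predicted as non-DN
    fn: int  # DN patients predicted as non-DN


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute accuracy, precision, recall, specificity, and F1 from counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are NOT taken from the study.
    example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Reporting specificity alongside recall is useful here because the two describe errors in opposite directions: recall counts missed DN cases, while specificity counts false alarms among patients without DN.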

Results:

XGBoost and LightGBM demonstrated superior performance, with XGBoost achieving the highest accuracy of 86.87%, a precision of 88.90%, a recall of 84.40%, an F1 score of 86.44%, and a specificity of 89.12%. LIME and SHAP analyses quantified the contribution of individual features to the models' decisions, identifying serum creatinine, albumin, and lipoproteins as significant predictors.

Conclusion: The developed machine learning model provides a robust predictive tool for early diagnosis and risk assessment of DN while ensuring the transparency and interpretability that are crucial for clinical integration. By enabling early intervention and personalized treatment strategies, this model has the potential to improve patient outcomes and optimize healthcare resource utilization.
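As a worked check on the XGBoost figures reported in the Results, the F1 score is the harmonic mean of precision $P$ and recall $R$. Substituting the reported precision and recall gives:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.8890 \times 0.8440}{0.8890 + 0.8440}
    = \frac{1.500632}{1.7330}
    \approx 0.8659
```

This is close to, but not identical with, the reported F1 score of 86.44%. One possible explanation, which the abstract does not state and is therefore only an assumption, is that each metric was averaged separately over several validation folds, in which case the averaged F1 need not equal the harmonic mean of the averaged precision and recall.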

Citation

Please cite as:

Wen Y, Wan Z, Ren H, Wang X, Wang W

Interpretable Machine Learning Model for Predicting and Assessing the Risk of Diabetic Nephropathy: Prediction Model Study

JMIR Med Inform 2025;13:e64979

DOI: 10.2196/64979

PMID: 41124652

PMCID: 12543291


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 1, 2024

Open Peer Review Period: Aug 26, 2024 - Oct 21, 2024

Date Accepted: Sep 9, 2025

Date Submitted to PubMed: Sep 13, 2025

