Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 1, 2024
Open Peer Review Period: Aug 26, 2024 - Oct 21, 2024
Date Accepted: Sep 9, 2025
Date Submitted to PubMed: Sep 13, 2025

The final, peer-reviewed published version of this preprint can be found here:

Wen Y, Wan Z, Ren H, Wang X, Wang W

Interpretable Machine Learning Model for Predicting and Assessing the Risk of Diabetic Nephropathy: Prediction Model Study

JMIR Med Inform 2025;13:e64979

DOI: 10.2196/64979

PMID: 41124652

PMCID: 12543291

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Interpretable Machine Learning Model for Predicting and Risk Assessment of Diabetic Nephropathy

  • Yili Wen; 
  • Zhiqiang Wan; 
  • Huiling Ren; 
  • Xu Wang; 
  • Weijie Wang

ABSTRACT

Introduction:

Diabetic nephropathy (DN), a severe complication of diabetes, is characterized by proteinuria, hypertension, and progressive decline in renal function, and can lead to end-stage renal disease (ESRD). Its pathogenesis involves high glucose levels, oxidative stress, inflammation, and fibrosis, which produce kidney changes such as glomerular basement membrane thickening and glomerulosclerosis. The International Diabetes Federation projects that 783 million people will have diabetes by 2045, and 30%-40% of them are expected to develop DN. Early detection and intervention are crucial for preserving renal function, improving quality of life, reducing cardiovascular complications, and lowering healthcare costs.

Methods:

This study used machine learning (ML) techniques to develop and validate a predictive model for DN, aiming for both high predictive accuracy and model interpretability. Data from 1,000 patients with type 2 diabetes, 444 with DN and 556 without, were analyzed. Several ML algorithms were compared, including decision trees, random forests, Extra Trees, AdaBoost, XGBoost, and LightGBM. Multiple imputation was used to handle missing data, and the Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance. Model performance was evaluated with accuracy, precision, recall, F1 score, specificity, and area under the curve (AUC). Explainable Machine Learning (XML) techniques, namely Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), were used to enhance model transparency and interpretability.
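As a reading aid, the threshold-based metrics named above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the class name `ConfusionCounts`, the function `classification_metrics`, and the example counts are hypothetical and chosen only for illustration. AUC is left out because it depends on ranked prediction scores rather than on counts at a single threshold.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with DN assumed to be the positive class."""
    tp: int  # DN patients predicted as DN
    fp: int  # non-DN patients predicted as DN
    tn: int  # non-DN patients predicted as non-DN
    fn: int  # DN patients predicted as non-DN


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return accuracy, precision, recall, specificity and F1.

    Assumes every denominator is non-zero; degenerate cases are not handled.
    """
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)       # also called sensitivity
    specificity = c.tn / (c.tn + c.fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only; they are NOT taken from the study.
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Precision and specificity both penalize false positives but normalize differently (by predicted positives versus actual negatives), which is why a study may report them side by side.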

Results:

XGBoost and LightGBM demonstrated superior performance, with XGBoost achieving the highest accuracy of 86.87%, a precision of 88.90%, a recall of 84.40%, an F1 score of 86.44%, and a specificity of 89.12%. LIME and SHAP analyses provided insights into the contribution of individual features to the prediction outcomes, identifying serum creatinine, C-peptide, albumin, and lipoproteins as significant predictors.

Conclusion:

The developed ML model not only provides a robust predictive tool for early diagnosis and risk assessment of DN but also ensures transparency and interpretability, crucial for clinical integration. By enabling early intervention and personalized treatment strategies, this model has the potential to improve patient outcomes and optimize healthcare resource utilization.
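To illustrate what a per-feature contribution means in the SHAP analysis reported in the Results, the sketch below computes exact Shapley values for a toy risk score by averaging each feature's marginal contribution over all feature orderings. This is an assumption-laden illustration, not the study's model or the SHAP library: `toy_risk`, its coefficients, the baseline values, and the patient values are all invented, and only the feature names echo predictors mentioned in the abstract. Exhaustive enumeration is feasible only for a handful of features; practical tools approximate it.

```python
from itertools import permutations
from math import factorial
from typing import Callable, Dict

Features = Dict[str, float]


def shapley_values(predict: Callable[[Features], float],
                   instance: Features,
                   baseline: Features) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings.

    Features not yet 'switched on' in an ordering keep their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(x: Features) -> float:
    """Hypothetical risk score with one interaction term (NOT the study's model)."""
    score = 0.5 + 0.004 * x["serum_creatinine"] - 0.01 * x["albumin"] - 0.05 * x["c_peptide"]
    # Invented interaction: high creatinine together with low albumin adds risk.
    if x["serum_creatinine"] > 100 and x["albumin"] < 35:
        score += 0.2
    return score


# Invented reference and patient values, in arbitrary units.
baseline = {"serum_creatinine": 70.0, "albumin": 42.0, "c_peptide": 2.0}
patient = {"serum_creatinine": 130.0, "albumin": 30.0, "c_peptide": 1.0}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split evenly between the two features that jointly trigger it. These two properties are what make Shapley-based explanations additive and easy to read per patient.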


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.