Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 16, 2019
Date Accepted: Oct 31, 2019
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Longitudinal risk prediction of chronic kidney disease in diabetic patients using temporal-enhanced gradient boosting machine
ABSTRACT
Background:
Artificial intelligence-enabled analysis of electronic health records (EHRs) can transform medical practice, from the diagnosis and prediction of complex diseases to recommendations for patient care. This is especially relevant for chronic conditions such as chronic kidney disease (CKD), one of the most frequent complications in diabetic patients and one associated with substantial morbidity and mortality.
Objective:
Longitudinal prediction of health outcomes requires an effective representation of the temporal data in EHRs. In this study, we propose a novel temporal-enhanced gradient boosting machine model that dynamically updates and ensembles learners based on new events in patient timelines to improve the accuracy of CKD prediction among diabetic patients.
Methods:
Using a broad spectrum of de-identified EHR data from a retrospective cohort of 14,039 adult patients with type 2 diabetes, and with a gradient boosting machine (GBM) as the base learner, we validated our proposed Landmark-Boosting model against three state-of-the-art temporal models for rolling predictions of 1-year diabetic kidney disease (DKD) risk.
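As an illustration of what "rolling predictions of 1-year risk" implies for data preparation, the following minimal sketch builds one training row per patient at a given landmark time, using only events observed up to that landmark and labeling the row by whether the outcome occurs within the following year. The `Patient` record layout, the `landmark_rows` helper, the last-value-carried-forward features, and the handling of censored patients are all assumptions made for this example; the abstract does not describe the authors' actual feature construction or their Landmark-Boosting procedure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Patient:
    # Hypothetical record layout; all times are in years since diabetes onset.
    patient_id: str
    events: List[Tuple[float, str, float]]  # (time, feature name, value)
    dkd_onset: Optional[float]              # None if DKD was never observed
    last_followup: float                    # end of observation


def landmark_rows(patients: List[Patient], landmark: float,
                  horizon: float = 1.0) -> List[Dict]:
    """Build one row per patient who is still at risk at `landmark`."""
    rows = []
    for p in patients:
        # Drop patients who already have DKD or have left follow-up.
        if p.dkd_onset is not None and p.dkd_onset <= landmark:
            continue
        if p.last_followup <= landmark:
            continue
        has_event = (p.dkd_onset is not None
                     and p.dkd_onset <= landmark + horizon)
        # Assumption: patients censored inside the window without an event
        # have an unknown outcome and are skipped.
        if not has_event and p.last_followup < landmark + horizon:
            continue
        # Keep the most recent value of each feature seen up to the landmark.
        latest: Dict[str, float] = {}
        for time, name, value in sorted(p.events):
            if time <= landmark:
                latest[name] = value
        rows.append({"patient_id": p.patient_id,
                     "features": latest,
                     "label": int(has_event)})
    return rows
```

Calling `landmark_rows` once per landmark (for example at years 1, 2, and 3) yields a sequence of datasets, each of which sees more accumulated history than the one before it. A base learner could then be fitted to each dataset in turn.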
Results:
The proposed model uniformly outperformed the other models, achieving an area under the receiver operating characteristic curve (AUROC) of 0.83 [95% CI, 0.76 – 0.85], 0.78 [95% CI, 0.75 – 0.82], and 0.82 [95% CI, 0.78 – 0.86] in predicting DKD risk with automatic accumulation of new data in later years (years 2, 3, and 4 since diabetes mellitus (DM) onset, respectively). The Landmark-Boosting model also maintained the best calibration across moderate- and high-risk groups and over time. The experimental results demonstrated that the proposed temporal model can not only accurately predict 1-year DKD risk but can also improve its performance over time as additional data accumulate, which is essential for clinical use in improving the renal management of diabetic patients.
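For readers unfamiliar with how an AUROC and its confidence interval can be obtained, the sketch below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and derives a percentile bootstrap interval. The abstract does not state how the reported intervals were computed, so the bootstrap approach, the function names, and the parameters `n_boot`, `alpha`, and `seed` are assumptions for illustration only.

```python
import random


def auroc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for AUROC."""
    auroc(labels, scores)  # fail early if a class is missing entirely
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # this resample lacks one class; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

The pairwise computation is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based implementation would be preferable for a cohort of the size described here.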
Conclusions:
Incorporating temporal information from EHR data can significantly improve predictive model performance and will particularly benefit patients who follow up with their physicians as recommended.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.