Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 19, 2020
Date Accepted: Sep 27, 2020
Detecting Miscoded Diabetes Diagnosis Codes in EHR for Quality Improvement: A Temporal Deep Learning Approach
ABSTRACT
Background:
Diabetes affects more than 30 million patients across the US. With such a large disease burden, even a small classification error can be significant. Currently, billing codes assigned at the time of a medical encounter are the “gold standard” for the diseases actually present in an individual and thus, in aggregate, reflect disease prevalence in the population. These codes are generated by highly trained coders and by healthcare providers but are not always accurate.
Objective:
This work provides a scalable deep learning methodology to more accurately classify individuals with diabetes across multiple healthcare systems.
Methods:
We used a Long Short-Term Memory Dense Neural Network (LSTM-DNN) model to identify patients with and without diabetes using data from five acute care facilities comprising 187,187 patients and 275,407 encounters. The data elements included laboratory test results, diagnostic and procedure codes, medications, demographic data, and admission information. In addition, a blinded physician panel reviewed discordant cases, providing an estimate of the total impact on the population.
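As an illustration of the kind of temporal input a recurrent model consumes, the sketch below groups encounter-level records into one chronologically ordered sequence per patient. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `build_patient_sequences`, the field names (`patient_id`, `admit_date`, `glucose`), and the toy values are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def build_patient_sequences(encounters):
    """Group encounter records by patient and sort each group by admission date."""
    by_patient = defaultdict(list)
    for enc in encounters:
        by_patient[enc["patient_id"]].append(enc)
    return {
        pid: sorted(encs, key=lambda e: e["admit_date"])
        for pid, encs in by_patient.items()
    }


# Hypothetical toy records; field names and values are illustrative only.
toy_encounters = [
    {"patient_id": "p1", "admit_date": date(2019, 5, 2), "glucose": 182.0},
    {"patient_id": "p2", "admit_date": date(2019, 1, 9), "glucose": 95.0},
    {"patient_id": "p1", "admit_date": date(2018, 11, 20), "glucose": 140.0},
]

sequences = build_patient_sequences(toy_encounters)
```

Each value in `sequences` is a per-patient list of encounters in time order, which is the shape a sequence model would typically expect before feature encoding and padding.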
Results:
When predicting the documented diagnosis of diabetes, our model achieved an F1-score of 84%, an AUC-ROC of 96%, and an average precision of 91% on a heterogeneous dataset from five distinct health facilities. However, in 81% of the cases where the model disagreed with the documented phenotype, a blinded physician panel agreed with the model. Taken together, these findings suggest that 4.3% of our studied population have either a missing or an improper diabetes diagnosis.
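For readers unfamiliar with the F1-score, the sketch below shows how it can be computed from binary labels and how discordant cases (where the prediction differs from the documented code) can be listed for manual review. It is a minimal illustration, not the study's evaluation code: the helper names and the toy labels are hypothetical and unrelated to the reported results.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = diabetes documented)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def discordant_indices(y_true, y_pred):
    """Return positions where the prediction disagrees with the documented label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]


# Hypothetical toy labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
to_review = discordant_indices(y_true, y_pred)
```

In a review workflow of the kind described above, the cases in `to_review` would be the ones passed to a blinded panel, since the documented code itself may be the source of the disagreement.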
Conclusions:
This study demonstrates that deep learning methods can improve clinical phenotyping, even when patient data are noisy, sparse, and heterogeneous.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.