Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 3, 2020
Date Accepted: Jun 28, 2020
Date Submitted to PubMed: Jul 14, 2020
Model-based Algorithms for Detecting Peripheral Artery Disease Using Administrative Data From an Electronic Health Record Data System
ABSTRACT
Background:
Peripheral artery disease (PAD) affects 8-10 million Americans, who face significantly elevated risks of both mortality and major limb events (such as amputation). Unfortunately, PAD is relatively under-diagnosed, under-treated, and under-researched, leading to wide variations in treatment patterns and outcomes. Efforts to improve PAD care and outcomes have been hampered by persistent difficulties identifying PAD patients for clinical and investigatory purposes.
Objective:
The goal was to develop and validate a model-based algorithm to detect patients with PAD using data from an electronic health record (EHR) system.
Methods:
An initial query of the EHR in a large health system identified all patients with PAD-related diagnosis codes for any encounter during the study period. Clinical adjudication of PAD diagnosis was performed by chart review on a random subgroup. A binary logistic regression model to predict PAD was built and validated in the adjudicated patients using a least absolute shrinkage and selection operator (LASSO) approach. The algorithm was then applied to the non-sampled records to further evaluate its performance.
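As an illustration only, the kind of L1-penalized (LASSO) logistic regression described above can be sketched in a few lines of plain Python. This is not the authors' implementation: the fitting method (proximal gradient descent), the penalty strength `lam`, the step size, the three binary flags, and the synthetic patients are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.02, lr=0.5, n_iter=300):
    """Fit logistic regression with an L1 penalty on the weights
    (the intercept is not penalized) by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the log-loss, then shrinkage from the L1 penalty.
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Synthetic, hypothetical patients: two informative binary flags
# (e.g. a diagnosis-code flag and an imaging flag) and one pure-noise flag.
rng = random.Random(0)

def make_patient():
    has_pad = rng.random() < 0.5
    flag0 = 1 if rng.random() < (0.8 if has_pad else 0.2) else 0
    flag1 = 1 if rng.random() < (0.7 if has_pad else 0.3) else 0
    noise = 1 if rng.random() < 0.5 else 0
    return [flag0, flag1, noise], int(has_pad)

data = [make_patient() for _ in range(400)]
X = [row for row, _ in data]
y = [label for _, label in data]
weights, intercept = fit_l1_logistic(X, y)
```

The L1 penalty shrinks the coefficients of uninformative flags toward zero, and can set them exactly to zero, which is why this family of models doubles as a variable-selection step when many candidate code flags are entered at once.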
Results:
The initial EHR data query using 406 diagnostic codes yielded 15,406 patients, of whom 2,500 were randomly selected for ground truth PAD status adjudication. After rarely used and never-used codes were removed, 108 code flags remained. We entered these code flags plus administrative encounter, imaging, procedure, and specialist flags into a LASSO model. The area under the curve (AUC) for this model was 0.862.
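Two steps reported above, dropping rarely used code flags and scoring a model by AUC, can be sketched as follows. This is a minimal example under stated assumptions: the function names, the `min_count` cutoff, and the toy rows and scores are hypothetical, and the abstract does not state the threshold the authors used to define a rarely used code.

```python
def drop_rare_flags(rows, min_count=1):
    """Return the indices of flag columns set in at least min_count rows.
    min_count is a hypothetical cutoff chosen for illustration."""
    n_cols = len(rows[0])
    counts = [sum(row[j] for row in rows) for j in range(n_cols)]
    return [j for j, c in enumerate(counts) if c >= min_count]

def auc(scores, labels):
    """Probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy code-flag matrix: one row per patient, one column per diagnosis code.
rows = [[1, 0, 0], [1, 0, 1], [0, 0, 0], [1, 0, 0]]
kept = drop_rare_flags(rows, min_count=2)  # only column 0 appears twice or more

# Toy predicted probabilities and adjudicated PAD labels.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
toy_auc = auc(scores, labels)  # 5 of 6 positive/negative pairs ranked correctly
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of PAD from non-PAD patients. The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be preferable for cohorts of the size reported here.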
Conclusions:
The algorithm we constructed has two main advantages over other approaches to PAD patient identification. First, it was derived from a broad population of patients with many different PAD manifestations and treatment pathways across a large health system. Second, our model does not rely on clinical notes and can be applied in situations in which only administrative billing data (e.g. large administrative datasets) are available. A combination of diagnosis codes and administrative flags can accurately identify patients with PAD in large cohorts.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.