Accepted for/Published in: JMIR Formative Research
Date Submitted: Jun 29, 2021
Date Accepted: Jan 24, 2022
Date Submitted to PubMed: Jan 26, 2022
Performance of a Computational Phenotyping Algorithm for Sarcoidosis Using Diagnostic Codes in Electronic Medical Records: A Pilot Study from Two Veterans Affairs Medical Centers
ABSTRACT
Background:
Electronic medical records (EMRs) offer the promise of computationally identifying sarcoidosis cases. However, the accuracy of identifying those cases in the EMR is unknown.
Objective:
To determine the statistical performance of using the International Classification of Diseases (ICD) diagnostic codes to identify patients with sarcoidosis in the EMR.
Methods:
We used ICD diagnostic codes to identify sarcoidosis cases by searching the EMR of the San Francisco and Palo Alto Veterans Affairs medical centers, and randomly selected 200 patients. To improve the diagnostic accuracy of the computational algorithm in cases where histopathological data are unavailable, we developed an “index of suspicion” to identify cases with a “high index of suspicion” for sarcoidosis (confirmed and probable) based on clinical and radiographic features alone, using the American Thoracic Society practice guideline. Through medical record review, we determined the positive predictive value (PPV) of diagnosing sarcoidosis by two computational methods: ICD codes alone, and ICD codes plus the “high index of suspicion.”
Results:
Among the 200 patients, 158 had a high index of suspicion for sarcoidosis, and 142 had documentation of non-necrotizing granuloma confirming biopsy-proven sarcoidosis. The PPV of using ICD codes alone was 79% (95% CI=78.6%–80.5%) for identifying sarcoidosis cases, and 71% (95% CI=64.7%–77.3%) for identifying histopathologically confirmed sarcoidosis in the EMR. The inclusion of the generated “high index of suspicion” to identify confirmed sarcoidosis cases increased the PPV significantly to 100% (95% CI=96.5%–100%). Histopathology documentation alone was 90% sensitive when compared with the “high index of suspicion.”
Conclusions:
ICD codes are reasonable classifiers for identifying sarcoidosis cases within the EMR, with a positive predictive value of 79%. Using a computational algorithm to capture “index of suspicion” data elements could significantly improve the accuracy of case identification.
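As an illustration of how the proportions in the Results follow from the reported counts (200 reviewed, 158 with a high index of suspicion, 142 biopsy-confirmed), the sketch below recomputes them in Python. The abstract does not state how its confidence intervals were derived; the normal-approximation (Wald) interval used here is an assumption, chosen because it reproduces the reported 64.7%–77.3% interval for histopathologically confirmed cases. The function name `proportion_with_wald_ci` is hypothetical and not taken from the study.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald (normal-approximation) interval.

    Assumption: the abstract does not name its interval method; Wald is used
    here for illustration only.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts reported in the Results section of the abstract.
reviewed = 200          # patients selected by ICD code and chart-reviewed
high_suspicion = 158    # confirmed or probable sarcoidosis on review
biopsy_confirmed = 142  # non-necrotizing granuloma documented

# PPV of ICD codes alone against each reference standard.
ppv_any = high_suspicion / reviewed
ppv_histo, lo, hi = proportion_with_wald_ci(biopsy_confirmed, reviewed)

# Sensitivity of histopathology documentation relative to high index of suspicion.
sensitivity_histo = biopsy_confirmed / high_suspicion

print(f"PPV vs high index of suspicion: {ppv_any:.0%}")
print(f"PPV vs histopathology: {ppv_histo:.0%} (95% CI {lo:.1%} to {hi:.1%})")
print(f"Histopathology sensitivity: {sensitivity_histo:.0%}")
```

With small samples or proportions near 0% or 100%, an exact or Wilson interval is usually preferred over the Wald interval, which is why this sketch does not attempt to reproduce the interval reported for the 100% PPV.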
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.