Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 28, 2020
Open Peer Review Period: Jun 1, 2020 - Jun 12, 2020
Date Accepted: Oct 31, 2020
Date Submitted to PubMed: Nov 6, 2020
Development of phenotyping algorithms for the identification of organ transplant recipients: Cohort Study
ABSTRACT
Background:
Studies involving organ transplant recipients (OTR) are often limited to the variables collected in the national Scientific Registry of Transplant Recipients database. The electronic health record (EHR) contains additional variables that can augment this data source if OTR can be identified accurately.
Objective:
To develop methods to identify OTR from the EHR.
Methods:
We used a de-identified version of Vanderbilt’s EHR database, which contains nearly 3 million subjects, to develop algorithms to identify organ transplant recipients. We identified all 19,821 individuals with at least one International Classification of Diseases (ICD) or Current Procedural Terminology (CPT) code for organ transplantation. We performed chart review on 1,250 randomly selected individuals to determine transplant status. We constructed multiple machine learning models and calculated positive predictive values and sensitivity for combinations of codes.
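As an illustration of the evaluation described above, the following minimal sketch computes positive predictive value and sensitivity for a simple rule that flags a patient when the number of transplant codes meets a threshold. It is not the authors' implementation: the `ReviewedPatient` record, its field names, the `min_codes` threshold rule, and the toy data are all assumptions made for the example. Sensitivity here is measured only within the chart-reviewed sample, in which every patient already has at least one transplant code.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class ReviewedPatient:
    # Hypothetical record for one chart-reviewed patient.
    transplant_code_count: int  # number of transplant ICD/CPT codes on record
    is_recipient: bool          # transplant status determined by chart review


def ppv_and_sensitivity(
    patients: Iterable[ReviewedPatient], min_codes: int
) -> Tuple[float, float]:
    """Evaluate the rule 'flag if transplant_code_count >= min_codes'."""
    tp = fp = fn = 0
    for p in patients:
        flagged = p.transplant_code_count >= min_codes
        if flagged and p.is_recipient:
            tp += 1  # flagged and confirmed by chart review
        elif flagged:
            fp += 1  # flagged but not a recipient
        elif p.is_recipient:
            fn += 1  # recipient missed by the rule
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Toy data for illustration only; these are not values from the study.
toy = [
    ReviewedPatient(1, False),
    ReviewedPatient(2, False),
    ReviewedPatient(2, True),
    ReviewedPatient(3, False),
    ReviewedPatient(5, True),
    ReviewedPatient(8, True),
]

for threshold in (1, 3):
    ppv, sens = ppv_and_sensitivity(toy, threshold)
    print(f"min_codes={threshold}: PPV={ppv:.2f}, sensitivity={sens:.2f}")
```

Raising the threshold in a rule like this typically trades sensitivity for positive predictive value, which is the kind of trade-off that comparing combinations of codes is meant to expose.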
Results:
Of the 1,250 patients whose charts were reviewed, 740 were transplant recipients, 498 had no record of a transplant, and 12 were equivocal. Most patients with only one or two transplant codes did not have a transplant. The most common reasons for being labeled a non-transplant patient were a lack of data (n = 222, 44.2%) and the patient being evaluated for an organ transplant (n = 159, 31.7%). The most robust model was a random forest that identified organ transplant recipients with an overall positive predictive value (PPV) of 97% and a sensitivity of 94%.
Conclusions:
Electronic health records (EHRs) linked to biobanks are increasingly used to conduct large-scale studies but have been underused in organ transplantation research. We present validated methods for identifying OTR from the EHR that will enable the use of the full spectrum of clinical data in transplant research. Several different methods identified transplant cases with high accuracy from ICD and CPT codes.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.