Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 2, 2024
Date Accepted: Jul 21, 2024
Targeted development and validation of clinical prediction models in secondary care settings – opportunities and challenges for electronic health records data
ABSTRACT
Before a clinical prediction model (CPM) is deployed in clinical practice, its performance needs to be demonstrated in the population of intended use. This is also called 'targeted validation'. Many CPMs developed in tertiary settings may be most useful in secondary care, where the patient case mix is broad and practitioners need to triage patients efficiently. However, since structured and/or rich datasets of sufficient quality from secondary care to assess the performance of a CPM are scarce, a validation gap exists that hampers implementation of CPMs in secondary care settings. In this viewpoint, we highlight the importance of targeted validation and the use of CPMs in secondary care settings and discuss the potential and challenges of using electronic health record (EHR) data to overcome the existing validation gap. The introduction of software applications for text mining of EHRs allows the generation of structured 'big' datasets, but the imperfections of EHRs as a research database require careful validation of data quality. When using EHR data for the development and validation of CPMs, in addition to widely accepted checklists, we propose considering three additional practical steps: 1) Involve a local EHR expert (clinician, nurse) in the data extraction process, 2) Perform validity checks on the generated datasets, and 3) Provide metadata on how variables were constructed from EHRs. These steps help to generate EHR datasets that are statistically powerful, of sufficient quality, and replicable, and that enable targeted development and validation of CPMs in secondary care settings. This approach can fill a major gap in prediction modeling research and appropriately advance CPMs into clinical practice.