Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 7, 2022
Open Peer Review Period: Oct 26, 2022 - Dec 21, 2022
Date Accepted: Jan 25, 2023
FHIR-DHP: A Standardized Clinical Data Harmonisation Pipeline for scalable AI application deployment
ABSTRACT
Background:
Increasing digitalisation in the medical domain gives rise to large amounts of healthcare data, which have the potential to expand clinical knowledge and transform patient care if leveraged through artificial intelligence (AI). Yet big data and AI often cannot unlock their full potential at scale, owing to non-standardised data formats, a lack of technical and semantic data interoperability, and limited cooperation between stakeholders in the healthcare system. Despite the existence of standardised data formats for the medical domain, such as Fast Healthcare Interoperability Resources (FHIR), their prevalence and usability for AI remain limited.
Objective:
We developed a data harmonisation pipeline (DHP) for clinical data sets relying on the common FHIR data standard.
Methods:
We validated the performance and usability of our FHIR-DHP with data from the MIMIC IV database including > 40,000 patients admitted to an intensive care unit.
Results:
We present the FHIR-DHP workflow for transforming “raw” hospital records into a harmonised, AI-friendly data representation. The pipeline consists of five key preprocessing steps: querying of data from the hospital database, FHIR mapping, syntactic validation, transfer of harmonised data into the patient-model database, and export of data in an AI-friendly format for further medical applications. A detailed example of FHIR-DHP execution is presented for clinical diagnosis records.
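The five steps listed above can be illustrated with a minimal Python sketch for diagnosis records. It is not the authors' implementation: the function names, the in-memory dictionary standing in for the patient-model database, and the simplified structural check are all assumptions made for illustration. The input column names are modelled loosely on a diagnoses table such as the one in MIMIC IV, and the validation step checks only a few fields rather than the full FHIR specification.

```python
from typing import Dict, List

# Canonical FHIR code system URIs for ICD-9-CM and ICD-10-CM.
ICD_SYSTEMS = {
    9: "http://hl7.org/fhir/sid/icd-9-cm",
    10: "http://hl7.org/fhir/sid/icd-10-cm",
}

# Hypothetical "raw" diagnosis rows (illustrative values, not real patient data).
RAW_ROWS = [
    {"subject_id": 1001, "hadm_id": 5001, "icd_code": "I10", "icd_version": 10},
    {"subject_id": 1002, "hadm_id": 5002, "icd_code": "4019", "icd_version": 9},
]


def query_rows(source: List[dict]) -> List[dict]:
    """Step 1: stand-in for a hospital database query; returns copies of the rows."""
    return [dict(row) for row in source]


def to_fhir_condition(row: dict) -> dict:
    """Step 2: map one raw diagnosis row to a FHIR Condition-shaped dictionary."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{row['subject_id']}"},
        "encounter": {"reference": f"Encounter/{row['hadm_id']}"},
        "code": {
            "coding": [
                {"system": ICD_SYSTEMS[row["icd_version"]], "code": row["icd_code"]}
            ]
        },
    }


def validate(resource: dict) -> None:
    """Step 3: simplified structural check (an assumption, not full FHIR validation)."""
    if resource.get("resourceType") != "Condition":
        raise ValueError("not a Condition resource")
    if not resource.get("subject", {}).get("reference", "").startswith("Patient/"):
        raise ValueError("missing patient reference")
    codings = resource.get("code", {}).get("coding", [])
    if not codings or not all(c.get("system") and c.get("code") for c in codings):
        raise ValueError("missing coded diagnosis")


def export_flat(store: Dict[str, List[dict]]) -> List[dict]:
    """Step 5: flatten stored resources into one row per coded diagnosis."""
    rows = []
    for patient, resources in store.items():
        for res in resources:
            for coding in res["code"]["coding"]:
                rows.append(
                    {
                        "patient": patient,
                        "encounter": res["encounter"]["reference"],
                        "system": coding["system"],
                        "code": coding["code"],
                    }
                )
    return rows


def run_pipeline(source: List[dict]) -> List[dict]:
    """Run all five steps; a dict keyed by patient stands in for the patient-model database."""
    store: Dict[str, List[dict]] = {}
    for row in query_rows(source):
        resource = to_fhir_condition(row)
        validate(resource)
        # Step 4: transfer the validated resource into the (in-memory) store.
        store.setdefault(resource["subject"]["reference"], []).append(resource)
    return export_flat(store)


if __name__ == "__main__":
    for flat_row in run_pipeline(RAW_ROWS):
        print(flat_row)
```

Keeping each step as a separate function mirrors the staged structure described above, so that a real database query, a full FHIR validator, or a persistent store could replace the corresponding stand-in without touching the other steps.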
Conclusions:
Our approach enables scalable and needs-driven data modelling of large and heterogeneous clinical data sets. The FHIR-DHP is a pivotal step towards increasing cooperation, interoperability and quality of patient care in the clinical routine and for medical research. Clinical Trial: Data interoperability, FHIR, data standardisation pipeline, MIMIC IV
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.