Accepted for/Published in: JMIR Formative Research
Date Submitted: Jun 16, 2023
Date Accepted: Nov 1, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Accountable Research Data Sharing in a German Medical Data Integration Center with FAIR geared provenance implementation: Proof of Concept Study
ABSTRACT
Background:
Secondary investigations on digital health records such as electronic patient data provided by German medical data integration centers pave the way to improved future patient care. However, little information is captured about the integrity, traceability, and quality of the (sensitive) data elements. From a technical point of view, the widely accepted FAIR (findable, accessible, interoperable, reusable) principles for data stewardship demand that data be enriched with provenance-related metadata. One solution for health data research is the FHIR Provenance resource, which enables the interoperable expression of metadata.
Objective:
This study aims to establish provenance traces that underpin data integrity, reliability, and hence trust in electronic health records (EHRs), thereby increasing the accountability of the medical data integration center. As a first step, we present the implementation of a proof-of-concept provenance library that integrates international standards.
Methods:
Provenance provides information about the readiness of a data element for reuse and serves as a supplier of data governance. We followed a tailor-made roadmap for a provenance framework, investigated the data integration steps along the extract, transform, and load (ETL) phases according to a maturity model, and then deduced requirements for a provenance library. Based on this approach, we developed a provenance model with associated metadata and implemented a proof-of-concept provenance class. In addition, we integrated the international W3C provenance standard, mapped the resulting provenance records to the interoperable health care standard FHIR, and provided several representation formats. Finally, we measured and evaluated the resulting provenance traces.
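As an illustration of the mapping step described above, the following minimal sketch shows how a single provenance record, expressed in W3C PROV terms (entity, activity, agent), could be written out as a FHIR R4 Provenance resource. It is not the provenance class developed in this study: the class name `ProvenanceRecord`, its fields, and all reference strings are hypothetical, and only the element names of the FHIR Provenance resource (`target`, `occurredPeriod`, `recorded`, `activity`, `agent.who`, `entity.role`, `entity.what`) are taken from the standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import json


@dataclass
class ProvenanceRecord:
    """One trace for one data element in W3C PROV terms (hypothetical model)."""
    target_ref: str    # PROV entity that was generated, e.g. a loaded resource
    source_ref: str    # PROV entity it was derived from
    activity: str      # PROV activity, here the name of an ETL phase
    agent_ref: str     # PROV agent responsible for the activity
    started: datetime  # timezone-aware start of the activity
    ended: datetime    # timezone-aware end of the activity

    def to_fhir(self) -> dict:
        """Map the record onto elements of a FHIR R4 Provenance resource."""
        return {
            "resourceType": "Provenance",
            "target": [{"reference": self.target_ref}],
            "occurredPeriod": {
                "start": self.started.isoformat(),
                "end": self.ended.isoformat(),
            },
            # Assumption: the trace is recorded when the ETL step finishes.
            "recorded": self.ended.isoformat(),
            "activity": {"text": self.activity},
            "agent": [{"who": {"reference": self.agent_ref}}],
            "entity": [
                {"role": "source", "what": {"reference": self.source_ref}}
            ],
        }


# Hypothetical trace for the "transform" phase of an ETL pipeline.
record = ProvenanceRecord(
    target_ref="Observation/obs-1",
    source_ref="DocumentReference/raw-export-1",
    activity="transform",
    agent_ref="Device/etl-pipeline",
    started=datetime(2023, 6, 16, 8, 0, 0, tzinfo=timezone.utc),
    ended=datetime(2023, 6, 16, 8, 0, 5, tzinfo=timezone.utc),
)
fhir = record.to_fhir()
print(json.dumps(fhir, indent=2))
```

Keeping the record itself independent of FHIR and deriving the resource in a separate method is one way to support several representation formats from the same trace.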
Results:
For the first time, this study implements integrated provenance traces at the data element level in a German medical data integration center. We have developed and implemented a practical method that combines the strengths of quality-guided and health-standard-guided (meta)data management practices. We measured satisfactory pipeline execution times and achieved high levels of accuracy, reliability, and accountability for the processed clinical routine data. These outcomes should motivate the development of further tools for evidence-based and reliable provision of electronic health records for secondary use.
Conclusions:
The presented engineering strategy is generic and can thus be applied to a wide range of data sets, including those beyond clinical research. Our results improve the traceability of data elements along the data life cycle. Consequently, the system mitigates risks, since data analysis without knowledge of the origin and quality of all data elements is futile. The developed provenance class has the potential to significantly reduce the gap between academia and industry in health care. Future provenance analysis stands to benefit from the provenance traces.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.