Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Aug 25, 2024
Open Peer Review Period: Sep 9, 2024 - Nov 4, 2024
Date Accepted: Jan 31, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Integrating healthcare data in an i2b2 model persisted through Elasticsearch
ABSTRACT
The volume of digital data in healthcare is continually growing. Beyond their use in care delivery, the collected health data can be reused for secondary purposes such as research. In this context, Clinical Data Warehouses (CDWs) provide the infrastructure and organization needed to improve the secondary use of health data. Various data models have been proposed for organizing data in a CDW, including the i2b2 model, whose persistence relies on a relational database that can suffer from performance problems when queries are executed over very large datasets. In this article, we evaluate the technical feasibility and performance of an i2b2 implementation on the NoSQL database system Elasticsearch, using the Bordeaux University Hospital CDW, which holds data on 2.5 million patients and over 3 billion observations. We propose adaptations of the i2b2 model that take the specific features of Elasticsearch into account. We demonstrate that an Elasticsearch implementation is feasible and yields significant improvements in both query performance and the disk space used for storage. This implementation is currently used in production at Bordeaux University Hospital.
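To make the idea of persisting i2b2 facts in a document store more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not describe the actual model adaptations, so the flat one-document-per-observation layout is an assumption. The field names follow the standard i2b2 `observation_fact` columns, the query uses the standard Elasticsearch `bool`/`filter` query DSL, and the helper names `fact_to_document` and `concept_query` as well as the concept code `DEMO:GLUCOSE` are hypothetical.

```python
import json
from datetime import date


def fact_to_document(fact: dict) -> dict:
    """Turn one observation_fact-style row into a JSON-serializable document.

    Assumption: one flat document per observation; null columns are dropped.
    """
    doc = {key: value for key, value in fact.items() if value is not None}
    if isinstance(doc.get("start_date"), date):
        # Store dates as ISO 8601 strings so they serialize cleanly to JSON.
        doc["start_date"] = doc["start_date"].isoformat()
    return doc


def concept_query(concept_cd: str, since: str) -> dict:
    """Build a query body: observations of one concept on or after a date."""
    return {
        "query": {
            "bool": {
                # Filter clauses match exactly and do not compute scores.
                "filter": [
                    {"term": {"concept_cd": concept_cd}},
                    {"range": {"start_date": {"gte": since}}},
                ]
            }
        }
    }


# Hypothetical row using standard i2b2 observation_fact column names.
row = {
    "patient_num": 1001,
    "encounter_num": 42,
    "concept_cd": "DEMO:GLUCOSE",
    "start_date": date(2024, 1, 15),
    "valtype_cd": "N",
    "nval_num": 5.4,
    "tval_char": None,
}

document = fact_to_document(row)
query = concept_query("DEMO:GLUCOSE", "2024-01-01")
print(json.dumps(document))
print(json.dumps(query))
```

The query body could be sent to an index's `_search` endpoint; whether the production system uses flat, nested, or patient-centric documents cannot be determined from the abstract alone.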
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.