Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 1, 2025
Open Peer Review Period: Jul 8, 2025 - Sep 2, 2025
Date Accepted: Apr 26, 2026
INTEROPERABLE INTEGRATION OF A NATIONAL RARE DISEASE REGISTRY INTO A RARE EYE DISEASE DATA WAREHOUSE: AN IMPLEMENTATION STUDY
ABSTRACT
Background:
In France, clinical data on rare diseases are primarily collected through BaMaRa, a software platform used by national expert centers to populate the BNDMR, the national rare disease data warehouse. BaMaRa ensures standardized and structured data collection across all rare disease networks, with a focus on care coordination and epidemiological reporting. In 2024, FREDD, a health data warehouse dedicated to rare eye diseases, was developed by the SENSGENE rare disease network within the framework of the third French National Rare Disease Plan. Although their datasets overlap, BaMaRa and FREDD are not natively interoperable, so a dedicated, traceable pipeline is needed to transform BaMaRa exports into data suitable for inclusion in the warehouse. This transformation involves complex business rules that address structural, semantic, and domain-specific differences between the two systems.
Objective:
This article describes the design and implementation of a robust data transformation pipeline that automatically converts BaMaRa clinical records into a structured dataset aligned with the FREDD data model. The primary goal is to ensure that the data remain semantically consistent and suitable for secondary use.
Methods:
We developed a Python-based application, called FREDDEX, that integrates several configuration files encoding the domain-specific business rules required to align BaMaRa data with the FREDD schema. These rules cover the mapping of variable names and values, the management of multi-source redundancy, and data quality checks. The system was designed to be modular, auditable, and usable by clinical data managers with minimal technical expertise. FREDDEX was first tested on synthetic test cases and then validated on real-world data from the CHU de Strasbourg.
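As an illustration of the rule types listed above, the following is a minimal sketch of a configuration-driven transformation, not the FREDDEX implementation itself. The rule set, the field names (`sexe`, `date_naissance`, `patient_id`), and the three helper functions are hypothetical and chosen only for the example; the actual BaMaRa export fields and FREDD schema are not described in this abstract.

```python
# Hypothetical rule set. In practice such rules would live in
# configuration files rather than in the source code.
RULES = {
    # source variable name -> target variable name
    "rename": {"sexe": "sex", "date_naissance": "birth_date"},
    # target variable name -> {source value -> target value}
    "recode": {"sex": {"M": "male", "F": "female"}},
    # target variables that must be filled in
    "required": ["patient_id", "sex", "birth_date"],
}


def transform(record, rules):
    """Rename variables, then recode their values, as the rules dictate."""
    out = {}
    for name, value in record.items():
        target = rules["rename"].get(name, name)
        # Values without a recoding entry are passed through unchanged.
        out[target] = rules["recode"].get(target, {}).get(value, value)
    return out


def deduplicate(records):
    """Merge records sharing a patient_id, keeping the first non-empty value."""
    merged = {}
    for rec in records:
        slot = merged.setdefault(rec["patient_id"], {})
        for key, value in rec.items():
            if slot.get(key) in (None, ""):
                slot[key] = value
    return list(merged.values())


def check_quality(record, rules):
    """Return a list of human-readable issues for one transformed record."""
    return [
        f"missing {name}"
        for name in rules["required"]
        if record.get(name) in (None, "")
    ]


# Synthetic input: the same patient exported twice, plus an incomplete record.
exports = [
    {"patient_id": "P1", "sexe": "F", "date_naissance": ""},
    {"patient_id": "P1", "sexe": "F", "date_naissance": "1980-01-01"},
    {"patient_id": "P2", "sexe": "M", "date_naissance": ""},
]

rows = deduplicate([transform(r, RULES) for r in exports])
issues = {r["patient_id"]: check_quality(r, RULES) for r in rows}
```

Keeping the rules in a plain data structure, separate from the code that applies them, is one way to let a data manager adjust mappings without touching the program, and the per-record issue list gives a simple audit trail.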
Results:
FREDDEX successfully processed BaMaRa exports from multiple centers and converted the patient records into the FREDD format. With the business rules encoded in configuration files, the tool enabled rapid onboarding of new clinical centers and significantly reduced manual curation time. It also established a reproducible framework that can be adapted to other rare disease data reuse contexts, supporting interoperability with national and European platforms such as ERN-EYE.
Conclusions:
This automated extract, transform, load (ETL) process ensures the robust, standardized, and traceable reuse of BaMaRa data within FREDD. By integrating complex business rules and quality controls, it strengthens the interoperability and reliability of rare disease datasets, paving the way for large-scale research while reducing the burden on clinical teams.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.