Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Apr 19, 2024
Open Peer Review Period: May 16, 2024 - Jul 11, 2024
Date Accepted: Jan 30, 2025

The final, peer-reviewed published version of this preprint can be found here:

Federated Analysis With Differential Privacy in Oncology Research: Longitudinal Observational Study Across Hospital Data Warehouses

Riffel T, Créquit P, Baillet M, Paumier J, Marfoq Y, Sy R, Girardot O, Chanet T, Bayssat L, Mazières J, Vuiblet V, Ancel J, Dewolf M, Margraff F, Bachot C, Chmiel J

JMIR Med Inform 2025;13:e59685

DOI: 10.2196/59685

PMID: 40743474

PMCID: 12312987

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Distributed Analytics for Research in Hospitals (DARAH): Federated Analysis with Differential Privacy in a Real-World Oncology Study

  • Théo Riffel
  • Perrine Créquit
  • Maëlle Baillet
  • Jason Paumier
  • Yasmine Marfoq
  • Ronan Sy
  • Olivier Girardot
  • Thierry Chanet
  • Louise Bayssat
  • Julien Mazières
  • Vincent Vuiblet
  • Julien Ancel
  • Maxime Dewolf
  • François Margraff
  • Camille Bachot
  • Jacek Chmiel

ABSTRACT

Background:

Federated analytics in healthcare allows researchers to perform statistical queries on remote datasets without access to the raw data. This method arose from the need to perform statistical analysis on larger datasets collected at multiple healthcare centers while avoiding the regulatory, governance, and privacy issues that might arise if raw data were collected at a central location outside the healthcare centers. Despite some pioneering work, federated analytics is still not widely used on real-world data, and to our knowledge, no real-world study has yet combined it with other privacy-enhancing techniques such as differential privacy.

Keywords: federated analysis, differential privacy, real-world oncology study, non-small cell lung cancer, COVID-19

Objective:

The first objective of this study was to deploy a federated architecture in a real-world setting. The oncology study used for this deployment compared the medical management of patients with metastatic non-small cell lung cancer before and during/after the first wave of COVID-19. The second objective was to test differential privacy in this real-world scenario to assess its practicality and utility as a privacy-enhancing technology.

Methods:

A federated architecture platform was set up in the Toulouse, Reims, and Foch centers. After harmonization of the data in each center, statistical analyses were performed using DataSHIELD, a federated analysis R library. A new open-source differential privacy package for DataSHIELD, dsPrivacy, was also implemented.
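
To illustrate the general idea of combining federated aggregation with differential privacy, the following is a minimal Python sketch, not the DataSHIELD or dsPrivacy API (both are R packages, and their interfaces are not described here). It assumes the Laplace mechanism, a privacy budget split evenly between a sum and a count, and values clipped to a known range; the abstract does not state which mechanism or parameters the study used. All function names and the sample values are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_noisy_sum_and_count(values, lower, upper, epsilon, rng):
    """Runs inside one center: only noisy aggregates leave the site.

    Assumption: each patient contributes one value, clipped to [lower, upper],
    and neighbouring datasets differ by adding or removing one patient.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    eps_each = epsilon / 2.0  # split the budget between sum and count
    sum_sensitivity = max(abs(lower), abs(upper))
    noisy_sum = sum(clipped) + laplace_noise(sum_sensitivity / eps_each, rng)
    noisy_count = len(clipped) + laplace_noise(1.0 / eps_each, rng)
    return noisy_sum, noisy_count


def federated_mean(site_outputs):
    """Runs on the analyst side: combines per-site noisy aggregates."""
    total_sum = sum(s for s, _ in site_outputs)
    total_count = sum(c for _, c in site_outputs)
    return total_sum / total_count


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up values for three sites; not data from the study.
    sites = [[61.0, 70.0, 58.0], [66.0, 73.0], [59.0, 64.0, 68.0, 71.0]]
    outputs = [
        local_noisy_sum_and_count(v, 0.0, 100.0, epsilon=1.0, rng=rng)
        for v in sites
    ]
    print(federated_mean(outputs))
```

In this sketch the analyst never sees individual values, only one noisy sum and one noisy count per center. A smaller `epsilon` gives stronger privacy but a noisier estimate, which is the privacy and accuracy trade-off the study set out to assess.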

Results:

Fifty patients were enrolled in the Toulouse and Reims centers and 49 in the Foch center. We showed that DataSHIELD is a practical tool for conducting our study efficiently across all 3 centers without exposing data on a central node, once a secure network between the hospitals had been configured. All planned aggregated results were successfully generated. We also observed that differential privacy can be implemented in practice with promising trade-offs between privacy and accuracy, and we built an open-source library, dsPrivacy, that can be reused in future work.

Conclusions:

The federated architecture platform enabled a multicenter study to be conducted on real-world oncology data with strong privacy guarantees thanks to differential privacy.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.