Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 1, 2025
Date Accepted: Mar 24, 2026

The final, peer-reviewed published version of this preprint can be found here:

Rethinking Trust in Synthetic Health Data: Lessons From 7 European Research Initiatives

Declerck J, Kalra D, Airola A, Amer AYA, Chatzichristos C, del Mar Mañu M, M. de Brito Robalo B, Ghini F, Gutierrez-Torre A, Hoogteijling S, Hultsch S, Ramon J, Reidel S, Regazzoni F, Silva L, Silveira I, Maes C

J Med Internet Res 2026;28:e83369

DOI: 10.2196/83369

PMID: 42054696

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Building Trust in Synthetic Health Data: Methodological and Acceptance Challenges across Seven European Projects

  • Jens Declerck
  • Dipak Kalra
  • Antti Airola
  • Ahmed Youssef Ali Amer
  • Christos Chatzichristos
  • Maria del Mar Mañu
  • Bruno M. de Brito Robalo
  • Francesco Ghini
  • Alberto Gutierrez-Torre
  • Sem Hoogteijling
  • Susanne Hultsch
  • Jan Ramon
  • Sara Reidel
  • Francesco Regazzoni
  • Luís Silva
  • Inês Silveira
  • Christophe Maes

ABSTRACT

Background:

Synthetic data generation (SDG) is increasingly used in health research to address privacy concerns and data sharing barriers, particularly in cross-border and multi-institutional contexts. Despite growing interest, questions remain around the methodological soundness, legal status, and real-world trustworthiness of synthetic health data.

Objective:

This paper examines how SDG is operationalized across seven European projects in the HealthData4EU cluster. It identifies common methodological, compliance, and acceptance challenges, and highlights emerging strategies to ensure the quality and utility of synthetic health data.

Methods:

A qualitative multiple case study approach was used, combining workshop presentations with structured input from each project. Projects were analyzed across four aspects: methodological challenges, data quality assurance, trust and transparency strategies, and contributions to the wider synthetic data ecosystem.

Results:

Results reveal recurring challenges across projects: tension between privacy and utility, lack of shared quality validation standards, institutional readiness gaps, and legal ambiguity under GDPR. Projects employ varied strategies, including federated machine learning and validation pipelines, differential privacy, co-design processes, and metadata transparency tools, but implementation remains uneven. The absence of harmonised legal and evaluation frameworks continues to hinder broader uptake.

Conclusions:

To scale SDG from isolated pilots to trusted European infrastructure, four priorities are critical: shared evaluation metrics, legal clarity, sustainable platforms, and embedded stakeholder engagement. While rooted in specific projects, the insights offer broader relevance for global SDG efforts, as the core tensions identified are likely to recur across health systems.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.