Accepted for/Published in: JMIR Research Protocols

Date Submitted: Feb 13, 2023
Open Peer Review Period: Feb 13, 2023 - Apr 10, 2023
Date Accepted: Jun 28, 2023

The final, peer-reviewed published version of this preprint can be found here:

Data Quality– and Utility-Compliant Anonymization of Common Data Model–Harmonized Electronic Health Record Data: Protocol for a Scoping Review

Kamdje Wabo G, Prasser F, Gierend K, Siegel F, Ganslandt T

JMIR Res Protoc 2023;12:e46471

DOI: 10.2196/46471

PMID: 37566443

PMCID: 10457704

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Data Quality- and Utility-compliant Anonymization of Electronic Health Record Data in the context of Multiple Common Data Models and Research Data Standard: Protocol for a Scoping Review

  • Gaetan Kamdje Wabo; 
  • Fabian Prasser; 
  • Kerstin Gierend; 
  • Fabian Siegel; 
  • Thomas Ganslandt

ABSTRACT

Background:

The anonymization of electronic health record (EHR) data is essential to ensure privacy protection in secondary use scenarios. For interoperability reasons, a range of data-driven medical research projects have adopted Common Data Models (CDMs), which represent data in a quality- and utility-compliant way suitable for research. Few reviews investigate CDM-based implementations of formal data anonymization processes while also reflecting on data quality and utility issues.

Objective:

This scoping review will investigate the state of the art regarding how formal data anonymization processes are applied to medical research CDMs and data representation standards, and to what extent strategies for, or gaps in, dealing with quality problems of the resulting anonymized datasets are observed.

Methods:

In developing the protocol for this review, we used the framework of Arksey and O'Malley. Based on this, we will include only articles published in English and available through the databases PubMed and Web of Science. The literature search will be based on a query syntax validated by a librarian and complemented by manual searches to include further informal sources. Eligible references will undergo a de-duplication step, followed by a screening of paper titles and abstracts. In a second phase, a full-text reading will allow the final selection of the corresponding articles, while a domain expert will support resolving literature selection conflicts. During this phase, key information will be extracted, categorized, summarized, and analyzed with reference to a template-based structure. Tabulated and graphical analyses will be included in alignment with the PRISMA-ScR model. We also performed tentative searches of both target literature databases to estimate the retrievability of eligible papers.
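The de-duplication step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the record fields (`doi`, `title`), the matching rule (same DOI or same normalized title), and the function names are assumptions made for illustration only.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace runs,
    so that trivial formatting differences do not hide a duplicate."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first occurrence of each reference.

    A record counts as a duplicate if its DOI or its normalized title
    has already been seen (assumed rule, hypothetical field names).
    """
    seen_dois = set()
    seen_titles = set()
    unique = []
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        title = normalize_title(record.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already exported by the other database
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(record)
    return unique
```

Matching on titles alone can merge distinct papers that share a title, so a rule like this would only be a first pass before manual screening.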

Results:

The tentative searches of the PubMed and Web of Science databases yielded 119 and 296 de-duplicated matches, respectively, suggesting the availability of (potentially) relevant articles. Further analysis and selection steps will lead to a final literature set. The completion of this scoping review project is expected by the end of the second quarter of 2023.

Conclusions:

Outlining approaches to deploying formal data anonymization on CDMs, with consideration of the potentially associated data quality and utility issues, will provide useful insights for better preserving the fitness-for-use of anonymized data in the scientific usage of CDMs. This protocol describes a schedule for conducting a scoping review of the existing evidence and challenges, which will support follow-up investigations.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.