Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Apr 26, 2020
Date Accepted: Nov 11, 2020
Date Submitted to PubMed: Nov 12, 2020

The final, peer-reviewed published version of this preprint can be found here:

Proposal and Assessment of a De-Identification Strategy to Enhance Anonymity of the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) in a Public Cloud-Computing Environment: Anonymization of Medical Data Using Privacy Models

Jeon S, Seo J, Kim S, Lee J, Kim J, Sohn J, Moon J, Joo HJ

J Med Internet Res 2020;22(11):e19597

DOI: 10.2196/19597

PMID: 33177037

PMCID: 7728527

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Proposal and Assessment of De-identification Strategy to Enhance Anonymity of Observational Medical Outcomes Partnership Common Data Model in Public Cloud Computing Environment: Study for Medical Data Anonymity

  • Seungho Jeon
  • Jeongeun Seo
  • Sukyoung Kim
  • Jeongmoon Lee
  • Jongho Kim
  • Jangwook Sohn
  • Jongsub Moon
  • Hyung Joon Joo

ABSTRACT

Background:

The Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM), defined by the nonprofit organization Observational Health Data Sciences and Informatics (OHDSI), is gaining attention for its use in analyzing patient-level clinical data from multiple medical institutions. Analyzing these data in a public environment, such as a cloud system, requires an appropriate de-identification strategy.

Objective:

This study proposes a de-identification strategy composed of several rules that are applied together with the k-anonymity, l-diversity, and t-closeness privacy models. The proposed strategy is then evaluated on an actual CDM database.

Methods:

The CDM database used in this study was constructed by the Anam Hospital of Korea University. For analysis and evaluation, the ARX anonymizing framework was used with the k-anonymity, l-diversity, and t-closeness models.
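As a rough illustration of what the three privacy models measure, the following minimal Python sketch computes them on a toy table. It is not the ARX implementation and not the authors' code. The column names (`birth_decade`, `gender`, `drug`), the choice of quasi-identifiers, and the records are all hypothetical. The sketch assumes distinct l-diversity and, for t-closeness, a categorical sensitive attribute with equal ground distance, in which case the earth mover's distance reduces to the total variation distance.

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return classes


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in classes.values())


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values in any class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in classes.values())


def t_closeness(records, quasi_identifiers, sensitive):
    """Largest total variation distance between a class's sensitive-value
    distribution and the distribution over the whole table."""
    overall = Counter(r[sensitive] for r in records)
    n = len(records)
    worst = 0.0
    for group in equivalence_classes(records, quasi_identifiers).values():
        local = Counter(r[sensitive] for r in group)
        distance = 0.5 * sum(
            abs(local[v] / len(group) - overall[v] / n) for v in overall
        )
        worst = max(worst, distance)
    return worst


# Hypothetical toy data; not drawn from the study's CDM database.
records = [
    {"birth_decade": 1950, "gender": "F", "drug": "metformin"},
    {"birth_decade": 1950, "gender": "F", "drug": "aspirin"},
    {"birth_decade": 1960, "gender": "M", "drug": "aspirin"},
    {"birth_decade": 1960, "gender": "M", "drug": "aspirin"},
    {"birth_decade": 1960, "gender": "M", "drug": "statin"},
]
QI = ["birth_decade", "gender"]

k = k_anonymity(records, QI)
l = distinct_l_diversity(records, QI, "drug")
t = t_closeness(records, QI, "drug")
print(f"k={k}, l={l}, t={t:.2f}")
```

A table satisfies k-anonymity for any k up to the returned value, l-diversity for any l up to the returned value, and t-closeness for any threshold at or above the returned distance.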

Results:

The CDM database, constructed according to the rules established by OHDSI, exhibited a low risk of re-identification. The DRUG_EXPOSURE table exhibited the highest re-identifiable record rate in the dataset (11.3%) with a re-identification success rate of 0.03%. However, because all tables include at least one ‘highest risk’ value of 100%, suitable anonymizing techniques are needed. Because the CDM database preserves the ‘source values’ (raw data), and the combination of source values increases the risk of re-identification, this study proposes an enhanced de-identification strategy for the source values. When applying this strategy, the highest risk in the k-anonymity, l-diversity, and t-closeness privacy models is significantly reduced, and the overall possibility of re-identification is also reduced.
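The abstract does not define its risk measures, so the sketch below assumes the common prosecutor-style definitions: a record's risk is 1 divided by the size of its equivalence class, the highest risk is the maximum over records, and the success rate is the average risk. The source-value codes and the truncation rule used as a stand-in for generalization are invented for illustration and are not the rules proposed in the study.

```python
from collections import Counter


def risk_summary(rows, threshold=0.5):
    """Summarize re-identification risk for rows of quasi-identifier tuples.

    Assumes the prosecutor model: risk of a record = 1 / (class size).
    """
    sizes = Counter(rows)
    n = len(rows)
    per_record = [1.0 / sizes[row] for row in rows]
    return {
        # A unique record yields a highest risk of 1.0 (100%).
        "highest_risk": max(per_record),
        # Share of records whose risk exceeds the threshold.
        "records_at_risk": sum(r > threshold for r in per_record) / n,
        # Average risk, which equals number of classes / number of records.
        "success_rate": len(sizes) / n,
    }


def generalize(row):
    """Hypothetical rule: drop the suffix of the made-up source code."""
    decade, source_value = row
    return (decade, source_value.split("-")[0])


# Made-up (birth decade, source value) pairs; not real CDM content.
raw = [
    ("1950s", "X01-a"), ("1950s", "X01-a"), ("1950s", "X01-b"),
    ("1960s", "Y02-a"), ("1960s", "Y02-a"), ("1960s", "Y02-b"),
]

before = risk_summary(raw)
after = risk_summary([generalize(row) for row in raw])
print("before:", before)
print("after: ", after)
```

In this toy case the two unique source values drive the highest risk to 100% before generalization. Coarsening the source value merges them into larger classes, which lowers both the highest risk and the average risk.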

Conclusions:

De-identification with the proposed strategy can improve the privacy of the CDM database. This enhanced privacy is expected to encourage clinical research involving multiple centers.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.