JMIR Preprints #13046: Evaluation of Privacy Risks of Patients’ Data in China: Case Study

Current Preprint Settings

(as selected by the authors)

1. Allow access to the preprint PDF upon submission to:

(a) Open peer-review purposes
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) Nobody

2. When the manuscript is accepted, display the accepted manuscript PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) Nobody

3. When a final paper is published in a JMIR journal, display the preprint as follows:

(a) Allow download
(b) Show abstract only
(c) Do not display anything

4. If the paper is rejected from JMIR journals, display the preprint to:

(a) Logged-in users only
(b) Anybody, anytime
(c) Nobody

Evaluation of Privacy Risks of Patients’ Data in China: Case Study

Mengchun Gong;
Shuang Wang;
Lezi Wang;
Chao Liu;
Jianyang Wang;
Qiang Guo;
Hao Zheng;
Kang Xie;
Chenghong Wang;
Zhouguang Hui

Background:

Patient privacy is a ubiquitous problem around the world. Many existing studies have demonstrated the potential privacy risks associated with sharing of biomedical data. Owing to the increasing need for data sharing and analysis, health care data privacy is drawing more attention. However, to better protect biomedical data privacy, it is essential to assess the privacy risk in the first place.

Objective:

In China, there is no clear regulation for health systems to deidentify data. It is also not known whether a mechanism such as the Health Insurance Portability and Accountability Act (HIPAA) safe harbor policy will achieve sufficient protection. This study aimed to conduct a pilot study using patient data from Chinese hospitals to understand and quantify the privacy risks of Chinese patients.

Methods:

We used g-distinct analysis to evaluate the reidentification risks with regard to the HIPAA safe harbor approach when applied to Chinese patients’ data. More specifically, we estimated the risks based on the HIPAA safe harbor and limited dataset policies by assuming an attacker has background knowledge of the patient from the public domain.

Results:

The experiments were conducted on 0.83 million patients (with data field of date of birth, gender, and surrogate ZIP codes generated based on home address) across 33 provincial-level administrative divisions in China. Under the Limited Dataset policy, 19.58% (163,262/833,235) of the population could be uniquely identifiable under the g-distinct metric (ie, 1-distinct). In contrast, the Safe Harbor policy is able to significantly reduce privacy risk, where only 0.072% (601/833,235) of individuals are uniquely identifiable, and the majority of the population is 3000 indistinguishable (ie the population is expected to share common attributes with 3000 or less people).

Conclusions:

Through the experiments based on real-world patient data, this work illustrates that the results of g-distinct analysis about Chinese patient privacy risk are similar to those from a previous US study, in which data from different organizations/regions might be vulnerable to different reidentification risks under different policies. This work provides reference to Chinese health care entities for estimating patients’ privacy risk during data sharing, which laid the foundation of privacy risk study about Chinese patients’ data in the future.

Citation

Please cite as:

Gong M, Wang S, Wang L, Liu C, Wang J, Guo Q, Zheng H, Xie K, Wang C, Hui Z

Evaluation of Privacy Risks of Patients’ Data in China: Case Study

JMIR Med Inform 2020;8(2):e13046

DOI: 10.2196/13046

PMID: 32022691

PMCID: 7055805

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 15, 2018

Open Peer Review Period: Dec 18, 2018 - Feb 12, 2019

Date Accepted: Sep 26, 2019

(closed for review but you can still tweet)

Evaluation of Privacy Risks of Patients’ Data in China: Case Study

Citation

Copyright

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 15, 2018

Open Peer Review Period: Dec 18, 2018 - Feb 12, 2019

Date Accepted: Sep 26, 2019

(closed for review but you can still tweet)

Evaluation of Privacy Risks of Patients’ Data in China: Case Study

Citation

Per the author's request the PDF is not available.

Copyright