Accepted for/Published in: JMIR Formative Research
Date Submitted: Mar 27, 2019
Date Accepted: Dec 16, 2019
Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: A Feasibility Study
ABSTRACT
Background:
Collaborative privacy-preserving training methods allow the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and non-disclosure.
Objective:
In the present work, we assess the performance of a state-of-the-art neural network (NN) approach for the detection of protected health information in texts when the network is trained in a collaborative privacy-preserving way.
Methods:
Training uses distributed selective stochastic gradient descent, i.e., the participants exchange local learning results obtained on their private data sets rather than the data themselves.
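To illustrate the idea of exchanging local learning results, the following is a minimal sketch of one common formulation of distributed selective stochastic gradient descent. It is not the implementation used in the study: a toy linear model stands in for the neural network, and all names (`local_gradient`, `select_top`, `train`, `make_datasets`, `upload_fraction`) as well as the synthetic data are assumptions made for illustration only.

```python
import random


def local_gradient(weights, data):
    """Mean-squared-error gradient of a toy linear model y = w . x."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(data)
    return grad


def select_top(update, fraction):
    """Keep the largest-magnitude share of an update as {index: value}."""
    k = max(1, int(len(update) * fraction))
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in order[:k]}


def train(datasets, dim, rounds=300, lr=0.1, upload_fraction=0.5):
    """Participants take turns; only selected updates reach the shared weights."""
    global_weights = [0.0] * dim  # held by a hypothetical parameter server
    for _ in range(rounds):
        for data in datasets:  # each private data set is only read locally
            local = list(global_weights)  # download current parameters
            update = [-lr * g for g in local_gradient(local, data)]
            for i, delta in select_top(update, upload_fraction).items():
                global_weights[i] += delta  # share the selected entries only
    return global_weights


def make_datasets(true_weights, participants=5, samples=50, seed=0):
    """Synthetic, noise-free private data sets (purely illustrative)."""
    rng = random.Random(seed)
    datasets = []
    for _ in range(participants):
        data = []
        for _ in range(samples):
            x = [rng.uniform(-1.0, 1.0) for _ in true_weights]
            y = sum(w * xi for w, xi in zip(true_weights, x))
            data.append((x, y))
        datasets.append(data)
    return datasets


if __name__ == "__main__":
    target = [2.0, -1.0, 0.5]
    print(train(make_datasets(target), dim=len(target)))
```

In this sketch the raw samples are never passed to the shared state; only a selected subset of each participant's parameter update is. A real deployment would additionally need secure transport and a policy for which parameters are downloaded, neither of which is shown here.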
Results:
Five networks trained on separate real-world clinical data sets using the privacy-protecting protocol reach a mean F1 value of 0.955. The gold-standard centralised training, which is based on the union of all sets and does not take data security into consideration, reaches a final value of 0.962.
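For readers unfamiliar with the metric, the F1 value is the harmonic mean of precision and recall. The short sketch below shows the standard definition; the function name `f1_score` and the counts in the test call are hypothetical, and the abstract does not state at which level (e.g., token or entity) the reported values were computed.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall; returns 0.0 if undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2.0 * precision * recall / (precision + recall)


# Difference between the two F1 values reported above
gap = round(0.962 - 0.955, 3)
```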
Conclusions:
Using real-world clinical data, our study shows that the detection of protected health information can be secured by collaborative privacy-preserving training. More generally, the approach demonstrates the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.