JMIR Preprints #46322: DEFT: a web-based system for DE-identifying Free Text data in electronic medical records using human in the loop deep learning


DEFT: a web-based system for DE-identifying Free Text data in electronic medical records using human in the loop deep learning

Leibo Liu;
Oscar Perez-Concha;
Anthony Nguyen;
Vicki Bennett;
Victoria Blake;
Blanca Gallego Luxan;
Louisa Jorm

ABSTRACT

Background:

The valuable narrative free text in Electronic Medical Records (EMRs) must be de-identified by removing Personally Identifiable Information (PII) before releasing it for secondary use. Manual de-identification is time-consuming and labour-intensive. Existing de-identification systems have a steep learning curve.

Objective:

We sought to develop an accurate, web-based system for de-identifying free text in EMRs, which can be readily and easily adopted in real-world settings.

Methods:

DEFT was designed with the goals of easy adoption and rapid and secure de-identification at high accuracy. It provides a simple and task-focused web user interface for users to easily perform the de-identification work. An interactive learning loop powered by a state-of-the-art deep learning model is integrated into DEFT to speed up the de-identification process and increase its performance over time.
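The interactive learning loop described above can be illustrated with a minimal sketch. This is not DEFT's implementation: the function names (`human_in_the_loop`, `make_predictor`, `review`, `retrain`) are hypothetical, the "model" is a toy dictionary lookup standing in for the deep learning model, and the "human" is simulated by a small gold-standard table. It only shows the general pattern of pre-annotating a batch, having a reviewer correct it, and retraining on all corrections so far.

```python
from typing import Callable, Dict, List, Set, Tuple

Span = Tuple[int, int, str]            # (start, end, label)
Predictor = Callable[[str], List[Span]]


def make_predictor(vocab: Set[str]) -> Predictor:
    """Toy stand-in for a model: tags the first occurrence of each known PII string."""
    def predict(note: str) -> List[Span]:
        return [(note.find(w), note.find(w) + len(w), "NAME")
                for w in sorted(vocab) if w in note]
    return predict


def retrain(labelled: List[Tuple[str, List[Span]]]) -> Predictor:
    """Toy 'retraining': rebuild the vocabulary from every corrected span so far."""
    vocab = {note[start:end] for note, spans in labelled for start, end, _ in spans}
    return make_predictor(vocab)


def human_in_the_loop(notes: List[str], predict: Predictor,
                      review: Callable[[str, List[Span]], List[Span]],
                      retrain_fn: Callable[[List[Tuple[str, List[Span]]]], Predictor],
                      batch_size: int = 1):
    """Pre-annotate each batch, let a reviewer correct it, then retrain."""
    labelled: List[Tuple[str, List[Span]]] = []
    for i in range(0, len(notes), batch_size):
        batch = notes[i:i + batch_size]
        suggestions = [predict(note) for note in batch]          # model pre-annotates
        corrected = [review(n, s) for n, s in zip(batch, suggestions)]  # human corrects
        labelled.extend(zip(batch, corrected))
        predict = retrain_fn(labelled)                           # model improves
    return labelled, predict


# Simulated reviewer (assumption): always returns the gold-standard spans.
GOLD: Dict[str, List[Span]] = {
    "Seen by Dr Smith.": [(11, 16, "NAME")],
    "Dr Smith called Jane.": [(3, 8, "NAME"), (16, 20, "NAME")],
}


def review(note: str, suggested: List[Span]) -> List[Span]:
    return GOLD[note]
```

In this toy setting the predictor starts with an empty vocabulary and, after both notes are reviewed, can tag the names it has seen in a new note. A real system would replace the lookup with a trained sequence-labelling model.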

Results:

DEFT has advantages over existing systems in terms of its support for project management, user access control, data management, and an interactive learning process. In a real-world use case of de-identifying clinical notes, which were extracted from one referral hospital in Sydney, Australia, DEFT achieved a high F1 score of 95.07% using 600 annotated clinical notes.
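For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it over exact-match annotated spans. This is an assumption for illustration: the abstract does not state whether the reported score is token-level or entity-level, or how it is averaged, and the function name `span_f1` is hypothetical.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # (start, end, label)


def span_f1(gold: Iterable[Span], predicted: Iterable[Span]) -> float:
    """Exact-match F1 between gold and predicted PII spans."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0  # also covers empty gold or empty predictions
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

With two gold spans and two predicted spans of which one matches exactly, precision and recall are both 0.5, so the F1 score is 0.5.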

Conclusions:

The DEFT system can be rapidly deployed for de-identifying free text in EMRs. End users with minimal technical knowledge can perform the de-identification work with only a shallow learning curve.

Citation

Please cite as:

Liu L, Perez-Concha O, Nguyen A, Bennett V, Blake V, Gallego Luxan B, Jorm L

Web-Based Application Based on Human-in-the-Loop Deep Learning for Deidentifying Free-Text Data in Electronic Medical Records: Development and Usability Study

Interact J Med Res 2023;12:e46322

DOI: 10.2196/46322

PMID: 37624624

PMCID: 10492176


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Interactive Journal of Medical Research

Date Submitted: Feb 8, 2023

Date Accepted: Jul 24, 2023
