JMIR Preprints #35257: Development of a Natural Language Processing System for Assessing Quality Indicators from Free-Text Colonoscopy and Pathology Reports: Methodology Development and Applications

Jung Ho Bae;
Hyun Wook Han;
Sun Young Yang;
Gyuseon Song;
Soonok Sa;
Goh Eun Chung;
Ji Yeon Seo;
Eun Hyo Jin;
Hee Cheon Kim;
DongUk An

ABSTRACT

Background:

Manual data extraction for colonoscopy quality indicators is time- and labor-intensive. Natural language processing (NLP), a computer-based linguistic technique, can automate the extraction of important clinical information from unstructured free-text reports. Applications of NLP-based information extraction include identifying clinical information, such as adverse events, and optimizing clinical work, such as quality control and patient management.

Objective:

We aimed to develop an NLP pipeline for mixed Korean–English colonoscopy reports and to evaluate its performance in automatically assessing the adenoma detection rate (ADR), sessile serrated lesion detection rate (SDR), and surveillance interval (SI).

Methods:

The NLP tool was developed using 2000 screening colonoscopy records (1425 pathology reports) from Seoul National University Hospital Gangnam Center. It was then tested on another 1000 colonoscopy records, comparing manual review (MR) by five human annotators with the NLP pipeline. Additionally, data from 54,562 colonoscopies of 12,264 patients (aged ≥50 years) from 2010 to 2019 were analyzed using the NLP pipeline for colonoscopy quality indicators.

Results:

The overall accuracy on the test dataset was 95.8% (958/1000) for NLP vs. 93.1% (931/1000) for MR (P=.008). The mean total ADR in the test set was 46.8% (468/1000) with NLP vs. 47.2% (472/1000) with MR. The mean total SDR was 6.4% (64/1000) with NLP vs. 6.5% (65/1000) with MR. SI calculation showed similar performance for both methods. The mean ADR and SDR of the 25 endoscopists in the 10-year dataset were 42.0% (881/2098) and 3.3% (69/2098), respectively, with wide individual variability (16.3% [263/1615] to 56.2% [1014/1936] in ADR and 0.4% [6/1615] to 6.6% [124/1876] in SDR). The SI recommendation suggested a large difference in ADR and SDR based on the endoscopist’s performance.
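The test-set percentages above follow directly from the reported counts. The short Python sketch below recomputes them as a quick check; the function name `detection_rate` and the dictionary keys are illustrative assumptions and are not taken from the authors' pipeline.

```python
def detection_rate(detected: int, total: int) -> float:
    """Return detected/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * detected / total, 1)


# Counts copied from the Results paragraph (test set of 1000 records).
# The key names are hypothetical labels chosen for this sketch.
reported_counts = {
    "accuracy_nlp": (958, 1000),
    "accuracy_mr": (931, 1000),
    "adr_nlp": (468, 1000),
    "adr_mr": (472, 1000),
    "sdr_nlp": (64, 1000),
    "sdr_mr": (65, 1000),
}

rates = {name: detection_rate(*counts) for name, counts in reported_counts.items()}

for name, value in rates.items():
    print(f"{name}: {value}%")
```

The same helper could be applied per endoscopist, given each endoscopist's count of procedures with at least one adenoma (or sessile serrated lesion) and their total number of procedures.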

Conclusions:

The NLP pipeline can accurately and automatically calculate ADR, SDR, and SI from a multi-language colonoscopy report. It could be an important tool for improving colonoscopy quality and clinical decision support.

Clinical Trial:

This study was approved by the Institutional Review Board of SNUH (IRB 1909-093-670).

Citation

Please cite as:

Bae JH, Han HW, Yang SY, Song G, Sa S, Chung GE, Seo JY, Jin EH, Kim HC, An D

Natural Language Processing for Assessing Quality Indicators in Free-Text Colonoscopy and Pathology Reports: Development and Usability Study

JMIR Med Inform 2022;10(4):e35257

DOI: 10.2196/35257

PMID: 35436226

PMCID: 9055472

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Nov 29, 2021

Open Peer Review Period: Nov 28, 2021 - Dec 20, 2021

Date Accepted: Feb 25, 2022

Date Submitted to PubMed: Apr 18, 2022
