Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Nov 29, 2021
Open Peer Review Period: Nov 28, 2021 - Dec 20, 2021
Date Accepted: Feb 25, 2022
Date Submitted to PubMed: Apr 18, 2022

The final, peer-reviewed published version of this preprint can be found here:

Natural Language Processing for Assessing Quality Indicators in Free-Text Colonoscopy and Pathology Reports: Development and Usability Study

Bae JH, Han HW, Yang SY, Song G, Sa S, Chung GE, Seo JY, Jin EH, Kim HC, An D

JMIR Med Inform 2022;10(4):e35257

DOI: 10.2196/35257

PMID: 35436226

PMCID: 9055472

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Development of a Natural Language Processing Pipeline for Calculating Colonoscopy Quality Indicators: Comparison of Manual Review and Natural Language Processing

  • Jung Ho Bae
  • Hyun Wook Han
  • Sun Young Yang
  • Gyuseon Song
  • Soonok Sa
  • Goh Eun Chung
  • Ji Yeon Seo
  • Eun Hyo Jin
  • Hee Cheon Kim
  • DongUk An

ABSTRACT

Background:

Manual data extraction for colonoscopy quality indicators is time- and labor-intensive. Natural language processing (NLP), a computer-based linguistic technique, can automate the extraction of important clinical information from unstructured free-text reports. Applications of NLP-based information extraction include the identification of clinical information, such as adverse events, and the optimization of clinical work, such as quality control and patient management.

Objective:

We developed an NLP pipeline to process mixed Korean–English colonoscopy reports and evaluated its performance in automatically assessing the adenoma detection rate (ADR), sessile serrated lesion detection rate (SDR), and surveillance interval (SI).

Methods:

The NLP tool was developed using 2000 screening colonoscopy records (1425 pathology reports) from Seoul National University Hospital Gangnam Center. The NLP pipeline was then compared with a manual review (MR) by five human annotators on a separate test set of 1000 colonoscopy records. Additionally, data from 54,562 colonoscopies of 12,264 patients (aged ≥50 years) performed from 2010 to 2019 were analyzed using the NLP pipeline to calculate colonoscopy quality indicators.
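As a minimal sketch of the final aggregation step only, the following shows how ADR and SDR could be computed per endoscopist once each procedure has been labeled. It assumes the commonly used definition of a detection rate (the share of procedures with at least one confirmed lesion of that type); the abstract does not spell out the exact definitions used. The `Procedure` record, its field names, and the sample data are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Procedure:
    """One labeled colonoscopy (hypothetical structure, not the authors' schema)."""
    endoscopist: str   # hypothetical endoscopist identifier
    has_adenoma: bool  # at least one adenoma confirmed on pathology
    has_ssl: bool      # at least one sessile serrated lesion confirmed


def detection_rates(procedures):
    """Return {endoscopist: (ADR, SDR)} as fractions of that endoscopist's procedures."""
    totals = defaultdict(int)
    adenoma = defaultdict(int)
    ssl = defaultdict(int)
    for p in procedures:
        totals[p.endoscopist] += 1
        adenoma[p.endoscopist] += p.has_adenoma  # True counts as 1
        ssl[p.endoscopist] += p.has_ssl
    return {e: (adenoma[e] / n, ssl[e] / n) for e, n in totals.items()}


# Made-up example records, for illustration only.
records = [
    Procedure("A", True, False),
    Procedure("A", False, False),
    Procedure("A", True, True),
    Procedure("A", False, False),
    Procedure("B", False, False),
    Procedure("B", True, False),
]
rates = detection_rates(records)
for name, (adr, sdr) in sorted(rates.items()):
    print(f"{name}: ADR={adr:.1%}, SDR={sdr:.1%}")
```

The sketch deliberately leaves out the hard part of the pipeline, namely turning free-text Korean–English reports into the per-procedure labels.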

Results:

The overall accuracy of the test dataset was 95.8% (958/1000) for NLP vs. 93.1% (931/1000) for MR (P=.008). The mean total ADR in the test set was 46.8% (468/1000) with NLP vs. 47.2% (472/1000) with MR. The mean total SDR was 6.4% (64/1000) with NLP vs. 6.5% (65/1000) with MR. Calculating the SI revealed a similar performance between both methods. The mean ADR and SDR of the 25 endoscopists in the 10-year dataset were 42.0% (881/2098) and 3.3% (69/2098), respectively, indicating wide individual variability (ADR range 16.3% [263/1615] to 56.2% [1014/1936]; SDR range 0.4% [6/1615] to 6.6% [124/1876]). The SI recommendation suggested a large difference in ADR and SDR based on the endoscopist’s performance.
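The abstract does not say which statistical test produced the P value for the accuracy comparison. As an illustration only, the sketch below applies an unpaired, pooled two-proportion z-test to the reported counts (958/1000 vs. 931/1000); the choice of test and the function name `two_proportion_z_test` are assumptions, not the authors' method. Under that assumption the two-sided P value lands in the same neighborhood as the reported one.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided P value).

    Assumed test for illustration; the source does not name the test it used.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts as reported in the Results: NLP 958/1000 correct, MR 931/1000 correct.
z, p = two_proportion_z_test(958, 1000, 931, 1000)
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

Because both methods were evaluated on the same 1000 records, a paired test such as McNemar's would usually be the more appropriate choice, but it requires the per-record agreement table, which the abstract does not report.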

Conclusions:

The NLP pipeline can accurately and automatically calculate ADR, SDR, and SI from multilingual colonoscopy reports. It could be an important tool for improving colonoscopy quality and clinical decision support.

Clinical Trial:

This study was approved by the Institutional Review Board of SNUH (IRB 1909-093-670).



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.