Accepted for/Published in: JMIR AI

Date Submitted: Nov 23, 2022
Open Peer Review Period: Nov 23, 2022 - Jan 18, 2023
Date Accepted: Mar 31, 2023

The final, peer-reviewed published version of this preprint can be found here:

Detecting Ground Glass Opacity Features in Patients With Lung Cancer: Automated Extraction and Longitudinal Analysis via Deep Learning–Based Natural Language Processing

Lee K, Liu Z, Chandran U, Kalsekar I, Laxmanan B, Higashi M, Jun T, Ma M, Li M, Mai Y, Gilman C, Wang T, Ai L, Aggarwal P, Pan Q, Oh W, Stolovitzky G, Schadt E, Wang X

Detecting Ground Glass Opacity Features in Patients With Lung Cancer: Automated Extraction and Longitudinal Analysis via Deep Learning–Based Natural Language Processing

JMIR AI 2023;2:e44537

DOI: 10.2196/44537

PMID: 38875565

PMCID: 11041451

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Automated Extraction and Longitudinal Analysis of Ground Glass Opacity Features in Lung Cancer Patients Powered by Deep Learning-based Natural Language Processing

  • Kyeryoung Lee
  • Zongzhi Liu
  • Urmila Chandran
  • Iftekhar Kalsekar
  • Balaji Laxmanan
  • Mitchell Higashi
  • Tomi Jun
  • Meng Ma
  • Minghao Li
  • Yun Mai
  • Christopher Gilman
  • Tongyu Wang
  • Lei Ai
  • Parag Aggarwal
  • Qi Pan
  • William Oh
  • Gustavo Stolovitzky
  • Eric Schadt
  • Xiaoyan Wang

ABSTRACT

Background:

Ground-glass opacities (GGOs) appearing in computed tomography (CT) scans may indicate potential lung malignancy. Proper management of GGOs based on their features can prevent lung cancer (LCA) development. Electronic health records (EHRs) are rich sources of information on GGO nodules and their granular features, but most of this valuable information is embedded in unstructured clinical notes.

Objective:

To develop, test, and validate a deep learning-based natural language processing (NLP) tool that automatically extracts GGO features to inform the longitudinal trajectory of GGO status from large-scale radiology notes.

Methods:

We developed a deep learning NLP pipeline, based on a bidirectional long short-term memory network with a conditional random field layer (BiLSTM-CRF), to retrospectively extract GGOs and their granular features from the radiology notes of 13,216 lung cancer patients. We evaluated the pipeline with quality assessments and characterized the cohort by analyzing the distribution of nodule features longitudinally to assess changes in size and solidity over time.
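As an illustration of the extraction step, a sequence tagger of this kind typically emits one tag per token, which is then collapsed into structured entities. The sketch below assumes a BIO tagging scheme and uses hypothetical label names (STATUS, SIZE, GGO, LOCATION); neither the scheme nor the labels are taken from the authors' GGO ontology, and the function `bio_to_spans` is introduced here only for illustration.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-token BIO tags into (label, text) entity spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that is still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and drop this token.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical sentence and tags, not taken from the study data.
tokens = ["Stable", "8", "mm", "ground", "glass", "opacity",
          "in", "the", "right", "upper", "lobe"]
tags = ["B-STATUS", "B-SIZE", "I-SIZE", "B-GGO", "I-GGO", "I-GGO",
        "O", "O", "B-LOCATION", "I-LOCATION", "I-LOCATION"]
entities = bio_to_spans(tokens, tags)
```

Here `entities` holds one (label, text) pair per extracted feature, which is the kind of structured record a downstream longitudinal analysis could consume.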

Results:

Our NLP pipeline, built upon the GGO ontology we developed, achieved 95-100% precision, 89-100% recall, and 92-100% F1 scores across the different GGO features. We deployed this GGO NLP model to extract and structure comprehensive characteristics of GGOs from 29,496 radiology notes of 4,521 lung cancer patients. Comparing each patient's last note with the first, longitudinal analysis revealed that nodule size increased in 17.5% of patients, decreased in 15.1%, and remained unchanged in 67.4%. Among 1,127 patients who had longitudinal radiology notes of GGO status, 815 patients (72.3%) were reported to have stable status and 259 patients (23.0%) had increased/progressed status in the subsequent notes.
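The two kinds of figures reported above can be sketched in a few lines: precision, recall, and F1 from true-positive, false-positive, and false-negative counts, and a first-versus-last-note comparison of nodule size per patient. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names, the input layout (patient ID mapped to dated size measurements in mm), and the `tolerance_mm` parameter are all hypothetical, and the toy data is invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def size_change(first_mm: float, last_mm: float, tolerance_mm: float = 0.0) -> str:
    """Label the change between first and last measurement.

    tolerance_mm is a hypothetical threshold; the source does not state one.
    """
    delta = last_mm - first_mm
    if delta > tolerance_mm:
        return "increased"
    if delta < -tolerance_mm:
        return "decreased"
    return "unchanged"


def summarize(patients: Dict[str, List[Tuple[str, float]]]) -> Dict[str, float]:
    """Percentage of patients per change category.

    patients maps a patient ID to (ISO date, size in mm) pairs.
    Patients with fewer than two measurements are skipped.
    """
    counts: Counter = Counter()
    for notes in patients.values():
        if len(notes) < 2:
            continue
        # ISO-formatted dates sort chronologically as plain strings.
        ordered = sorted(notes, key=lambda note: note[0])
        counts[size_change(ordered[0][1], ordered[-1][1])] += 1
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()} if total else {}


# Toy data, invented for illustration only.
toy = {
    "p1": [("2019-01-10", 6.0), ("2020-03-02", 8.0)],
    "p2": [("2019-05-01", 10.0), ("2019-11-20", 7.0)],
    "p3": [("2018-02-14", 5.0), ("2019-02-14", 5.0)],
    "p4": [("2020-06-01", 5.0), ("2019-06-01", 5.0)],
    "p5": [("2020-01-01", 9.0)],  # single note, excluded from the summary
}
summary = summarize(toy)
```

Comparing only the first and last notes mirrors the description above; a per-note trajectory would need every intermediate measurement as well.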

Conclusions:

Our deep learning-based NLP pipeline can automatically extract granular GGO features at scale from EHRs when such information is documented in radiology notes and inform the natural history of GGO, which opens the way for a new paradigm in lung cancer prevention and early detection.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.