Accepted for/Published in: JMIR Formative Research

Date Submitted: Mar 20, 2022
Date Accepted: Dec 22, 2022

The final, peer-reviewed published version of this preprint can be found here:

How Natural Language Processing Can Aid With Pulmonary Oncology Tumor Node Metastasis Staging From Free-Text Radiology Reports: Algorithm Development and Validation

Puts S, Nobel M, Zegers C, Bermejo I, Robben S, Dekker A

JMIR Form Res 2023;7:e38125

DOI: 10.2196/38125

PMID: 36947118

PMCID: 10131747

How Natural Language Processing can aid with pulmonary oncology TNM staging from free-text radiology reports

  • Sander Puts
  • Martijn Nobel
  • Catharina Zegers
  • Iñigo Bermejo
  • Simon Robben
  • Andre Dekker

ABSTRACT

Background:

Natural language processing (NLP) is considered a promising way to extract concepts from free text and store them in a structured manner for data mining purposes. This also applies to radiology reports, which still consist mostly of free text. Accurate and complete reports are very important for clinical decision support, for instance in oncological staging. NLP can therefore be a tool to structure the content of radiology reports, thereby increasing their value.

Objective:

This study describes the implementation and validation of an N-stage classifier for pulmonary oncology. The classifier operates on free-text radiological chest computed tomography (CT) reports and follows the Tumor Node Metastasis (TNM) classification. It was added to an existing T-stage classifier to create a combined TN-stage classifier.

Methods:

spaCy, pyContextNLP, and regular expressions (RegEx) were used for information extraction, after additional rules were defined to accurately extract the N-stage.
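
To illustrate the general idea of rule-based N-stage extraction, the following is a minimal sketch that uses only Python's standard `re` module. It is not the authors' rule set: the keyword lists, the `classify_n_stage` function, and the sentence-level negation check are hypothetical simplifications that stand in for the spaCy and pyContextNLP components named above. A real classifier would need far richer rules, including laterality and node size.

```python
import re

# Hypothetical, heavily simplified keyword rules (assumption, not the
# authors' rules). Ordered from highest to lowest N-stage.
NODE_RULES = [
    ("N3", re.compile(r"\b(contralateral|supraclavicular)\b", re.I)),
    ("N2", re.compile(r"\b(mediastinal|subcarinal)\b", re.I)),
    ("N1", re.compile(r"\b(hilar|peribronchial)\b", re.I)),
]
# Crude negation cue; pyContextNLP-style tools model trigger scope far
# more carefully than this sentence-level check.
NEGATION = re.compile(r"\b(no|not|without)\b", re.I)
NODE_MENTION = re.compile(r"\b(lymph node|lymphadenopathy|node)s?\b", re.I)
STAGE_ORDER = ["N0", "N1", "N2", "N3"]


def classify_n_stage(report: str) -> str:
    """Return the highest N-stage suggested by non-negated sentences."""
    stage = "N0"
    # Naive sentence split on terminal punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        if not NODE_MENTION.search(sentence):
            continue  # sentence does not talk about lymph nodes
        if NEGATION.search(sentence):
            continue  # treat the whole sentence as negated
        for label, pattern in NODE_RULES:
            if pattern.search(sentence):
                # Keep the highest stage seen so far.
                if STAGE_ORDER.index(label) > STAGE_ORDER.index(stage):
                    stage = label
                break
    return stage


if __name__ == "__main__":
    text = ("Enlarged ipsilateral hilar lymph node. "
            "No mediastinal lymphadenopathy.")
    print(classify_n_stage(text))
```

In this sketch, a negated sentence is skipped entirely, so "No mediastinal lymphadenopathy" does not raise the stage. That is the kind of context handling that dedicated negation tooling provides in a more robust form.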

Results:

The overall accuracy of the TN-stage classifier was 0.84 on the training set (n = 95) and 0.85 on the validation set (n = 97). This is comparable to the outcomes of the T-stage classifier (0.87-0.92).

Conclusions:

This study shows that NLP has potential for classifying pulmonary oncology from free-text radiological reports according to the TNM classification system, as both the T-stage and the N-stage can be extracted with high accuracy.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.