Currently accepted at: Journal of Medical Internet Research

Date Submitted: Dec 18, 2024
Date Accepted: Apr 17, 2025

This paper has been accepted and is currently in production.

It will appear shortly on 10.2196/70257

The final accepted version (not copyedited yet) is in this tab.

Rule-based natural language processing in oncology clinical decision support: a systematic review and meta-analysis

  • Stephen Ali; 
  • Garikai Kungwengwe; 
  • Dafydd Hughes; 
  • Thomas Dobbs; 
  • Hayley Hutchings; 
  • Iain Whitaker

Background:

The exponential growth of healthcare data has driven increased use of natural language processing (NLP) for clinical decision support (CDS) in oncology. Although machine learning models have received considerable attention, the diagnostic performance and clinical utility of rule-based NLP systems remain less explored, particularly when compared with human assessments.

Objective:

This systematic review and meta-analysis evaluates the diagnostic accuracy of rule-based NLP systems for oncology CDS. We aimed to: (i) compute pooled sensitivity, specificity, and AUC; (ii) compare performance across tumour types and clinical tasks; and (iii) benchmark rule-based algorithms against clinician assessments. We hypothesised that rule-based systems achieve high accuracy in structured tasks but vary by tumour type.
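As a reminder of what objective (i) measures, the sketch below computes sensitivity and specificity from a single 2x2 table in which the rule-based NLP output is compared against a human reference standard. The class name, function names, and counts are hypothetical and purely illustrative; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table of NLP output versus human reference (hypothetical layout)."""
    tp: int  # NLP positive, reference positive
    fn: int  # NLP negative, reference positive
    tn: int  # NLP negative, reference negative
    fp: int  # NLP positive, reference negative


def sensitivity(c: ConfusionCounts) -> float:
    # Proportion of reference-positive cases that the algorithm flags.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Proportion of reference-negative cases that the algorithm clears.
    return c.tn / (c.tn + c.fp)


# Illustrative counts only.
example = ConfusionCounts(tp=90, fn=10, tn=190, fp=10)
print(f"sensitivity={sensitivity(example):.2f}, specificity={specificity(example):.2f}")
```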

Methods:

A systematic review was conducted by searching EMBASE, MEDLINE, CINAHL, the Cochrane Library, Web of Science, and the Collection of Computer Science Bibliographies for studies published up to 13 April 2020. Eligible studies applied rule-based NLP for cancer-related CDS with human comparators. Two reviewers independently screened records, extracted data via Covidence, and assessed study quality using TRIPOD criteria. A bivariate random-effects meta-analysis estimated pooled sensitivity, specificity, and AUC. Subgroup analyses compared performance across tumour types and clinical tasks. Univariate meta-regressions evaluated the influence of publication year and dataset size, and Deeks' regression test assessed publication bias.
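To illustrate the idea of random-effects pooling, the sketch below pools a single proportion (for example, sensitivity) across studies on the logit scale using the DerSimonian-Laird estimator, and reports I² from Cochran's Q. This is a deliberately simplified univariate stand-in, not the bivariate model used in the review, which pools sensitivity and specificity jointly and accounts for their correlation. The function name, the 0.5 continuity correction, and the input counts are assumptions made for illustration.

```python
import math


def pool_logit_dl(events, totals):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion (e.g. TP / (TP + FN)).
    Returns (pooled_proportion, tau2, i2).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n.
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))   # logit of the corrected proportion
        v.append(1 / a + 1 / b)     # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Method-of-moments estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return 1 / (1 + math.exp(-y_re)), tau2, i2


# Hypothetical true-positive counts and diseased totals from four studies.
pooled, tau2, i2 = pool_logit_dl([95, 180, 47, 60], [100, 200, 50, 80])
print(f"pooled={pooled:.3f}, tau2={tau2:.3f}, I2={100 * i2:.0f}%")
```

A full reproduction of the analysis would instead fit a bivariate or hierarchical summary ROC model in a dedicated meta-analysis package.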

Results:

Of 3,223 screened records, 89 studies met inclusion criteria, spanning publication years 1993–2020 and analysing over 1.2 million patient records. Breast cancer was the most frequently studied tumour type (24.7%), followed by multiple cancer types (14.6%), colorectal (12.4%), lung (10.1%), and prostate cancer (9.0%). Meta-analysis of 35 studies yielded a pooled sensitivity of 0.96 (95% CI: 0.93–0.97), specificity of 0.98 (95% CI: 0.95–0.99), and an AUC of 0.99. Subgroup analysis showed breast cancer algorithms achieved sensitivity and specificity of 0.98, while malignant melanoma algorithms reported lower sensitivity (0.49) but high specificity (0.93). Pancreatic cancer algorithms performed well (sensitivity 0.97, specificity 0.98). Most studies used retrospective designs, relying on electronic health records and pathology reports. Quality assessment scores ranged from 48% to 92% adherence to TRIPOD criteria. Risk of bias assessment rated 6 studies (6.7%) as high quality, 57 (64.0%) as fair, and 26 (29.2%) as low. Heterogeneity was moderate to high (I²: 59%–74%), with no significant association between performance and publication year or dataset size. Deeks' test indicated no publication bias (p = 0.73).
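The quality-rating percentages above follow directly from the reported counts and the 89 included studies. A small consistency check, using only figures stated in the abstract (the variable names are illustrative):

```python
# Counts of studies per quality rating, as reported in the abstract.
counts = {"high": 6, "fair": 57, "low": 26}

total = sum(counts.values())  # should equal the 89 included studies

# Percentage of included studies in each category, to one decimal place.
shares = {rating: round(100 * n / total, 1) for rating, n in counts.items()}

for rating, pct in shares.items():
    print(f"{rating}: {counts[rating]}/{total} = {pct}%")
```

The three rounded shares sum to 99.9% rather than 100% because of rounding to one decimal place.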

Conclusions:

Rule-based NLP systems exhibit high diagnostic accuracy for oncology CDS, though performance varies by tumour type and clinical context. Limitations include study heterogeneity, variable reporting standards, and reliance on retrospective data. These findings highlight the need for standardised reporting, direct comparisons with machine learning approaches, and prospective validation to enhance clinical applicability.

ClinicalTrial:

PROSPERO (CRD42020180676).


 Citation

Please cite as:

Ali S, Kungwengwe G, Hughes D, Dobbs T, Hutchings H, Whitaker I

Rule-based natural language processing in oncology clinical decision support: a systematic review and meta-analysis

Journal of Medical Internet Research. 17/04/2025:70257 (forthcoming/in press)

DOI: 10.2196/70257

URL: https://preprints.jmir.org/preprint/70257
