JMIR Preprints #67706: Extracting Pulmonary Embolism Diagnoses from Radiology Impressions Using GPT-4o: A Large Language Model Evaluation Study

Mohammed Mahyoub;
Kacie Dougherty;
Ajit Shukla

ABSTRACT

Background:

Pulmonary embolism (PE) is a critical condition requiring rapid diagnosis to reduce mortality. Extracting PE diagnoses from radiology reports manually is time-consuming, highlighting the need for automated solutions. Advances in natural language processing (NLP), especially transformer models like GPT-4o, offer promising tools to improve diagnostic accuracy and workflow efficiency in clinical settings.

Objective:

To develop an advanced NLP system using GPT-4o for the automatic extraction of PE diagnoses from radiology report impressions, enhancing clinical decision-making and workflow efficiency.

Methods:

Two approaches were developed and evaluated: a fine-tuned Clinical Longformer as a baseline model, and a GPT-4o-based extractor. The Clinical Longformer was trained on a dataset of 1,000 radiology report impressions and validated on a separate set of 200 samples, while the GPT-4o extractor was validated using the same 200-sample set. Post-deployment performance was further assessed on an additional 500 operational records to evaluate model efficacy in a real-world setting.

Results:

GPT-4o outperformed the Clinical Longformer, achieving 100% sensitivity, specificity, and F1 score across both training and post-deployment evaluations. This high level of accuracy supports a reduction in manual review, streamlining clinical workflows and improving diagnostic precision.
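For readers who want to reproduce the reported metrics on their own labeled impressions, the following is a minimal sketch of how sensitivity, specificity, and F1 score can be computed from binary PE labels. It is not the authors' evaluation code; the function name `binary_metrics` and the label encoding (1 = PE present, 0 = PE absent) are assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and F1 for binary labels.

    Assumed encoding (hypothetical): 1 = PE present, 0 = PE absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN when a metric is undefined (for example, no positive cases).
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),       # true positive rate
        "specificity": ratio(tn, tn + fp),       # true negative rate
        "f1": ratio(2 * tp, 2 * tp + fp + fn),   # harmonic mean of precision and recall
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    reference = [1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(binary_metrics(reference, predicted))
```

A result of 100% on all three metrics corresponds to zero false positives and zero false negatives in the evaluated set.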

Conclusions:

The GPT-4o model provides an effective solution for the automatic extraction of PE diagnoses from radiology reports, offering a reliable tool that aids timely and accurate clinical decision-making. This approach has the potential to significantly improve patient outcomes by expediting diagnosis and treatment pathways for critical conditions like PE.

Clinical Trial: NA

Citation

Please cite as:

Mahyoub M, Dougherty K, Shukla A

Extracting Pulmonary Embolism Diagnoses From Radiology Impressions Using GPT-4o: Large Language Model Evaluation Study

JMIR Med Inform 2025;13:e67706

DOI: 10.2196/67706

PMID: 40203306

PMCID: 12018862

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Oct 18, 2024

Date Accepted: Mar 13, 2025
