Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 13, 2025
Date Accepted: Apr 22, 2025
Enhancing Pulmonary Disease Prediction Using Large Language Models with Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study Based on Radiology Reports
ABSTRACT
Background:
The rapid advancements in natural language processing (NLP), particularly the development of large language models (LLMs), have opened new avenues for managing complex clinical text data. However, the inherent complexity and specificity of medical texts present significant challenges for the practical application of prompt engineering in diagnostic tasks.
Objective:
This study explores LLMs combined with a new prompt engineering strategy to enhance model interpretability and improve pulmonary disease prediction performance over a traditional deep learning model.
Methods:
A retrospective dataset of 2965 chest CT radiology reports was constructed. The reports came from four cohorts: healthy individuals and patients with pulmonary tuberculosis, lung cancer, or pneumonia. A novel prompt engineering strategy was then proposed that integrates feature summarization (F-Sum), chain-of-thought (CoT) reasoning, and a hybrid retrieval-augmented generation (RAG) framework. The feature summarization approach, leveraging TF-IDF and K-means clustering, was employed to extract and distill key radiological findings related to the three diseases. Simultaneously, the hybrid RAG framework combined dense and sparse vector representations to enhance the LLMs' comprehension of disease-related text. Three state-of-the-art LLMs, GLM-4-Plus, GLM-4-Air, and GPT-4o, were integrated with the prompt strategy to evaluate their efficacy in recognizing pneumonia, tuberculosis, and lung cancer. A traditional deep learning model, BERT, was also included for comparison to assess the relative advantages of LLMs. Finally, the proposed method was tested on an external validation dataset consisting of 343 chest CT reports from another hospital.
Results:
Compared with the BERT-based prediction model and various other prompt engineering techniques, our method with GLM-4-Plus achieved the best performance on the test dataset, attaining an F1-score of 0.8887 and an accuracy of 0.8947. On the external validation dataset, the proposed method with GPT-4o achieved the highest F1-score (0.8631) and accuracy (0.9167). Compared with the popular strategy of manually selected typical samples (few-shot) and physician-designed CoT (F1-score: 0.8294; accuracy: 0.8342), the proposed method, which summarized disease characteristics (F-Sum) with an LLM and automatically generated the CoT, performed better (F1-score: 0.8870; accuracy: 0.8974). Although the BERT-based model achieved comparable results on the test dataset (F1-score: 0.8514; accuracy: 0.8763), its predictive performance decreased significantly on the external validation set (F1-score: 0.4836; accuracy: 0.7833).
Conclusions:
These findings highlight the potential of LLMs to revolutionize pulmonary disease prediction, particularly in resource-constrained settings, by surpassing traditional models in both accuracy and flexibility. The proposed prompt engineering strategy not only improves predictive performance but also enhances the adaptability of LLMs in complex medical contexts, offering a promising tool for advancing disease diagnosis and clinical decision making.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.