JMIR Preprints #75052: Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study

Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study

Ying-Hao Chen;
Shanq-Jang Ruan;
Pei-fu Chen

ABSTRACT

Background:

Accurately assessing perioperative risk is critical for informed surgical planning and patient safety. However, current prediction models often rely solely on structured data and overlook the nuanced clinical reasoning embedded in free-text preoperative notes. Recent advances in large language models (LLMs) have opened new opportunities for harnessing unstructured clinical data, yet their application in perioperative prediction remains limited by concerns about factual accuracy. Retrieval-augmented generation (RAG) offers a promising solution: by grounding LLM outputs in authoritative medical sources, it may improve both predictive accuracy and clinical interpretability.

Objective:

This study aimed to investigate whether integrating LLMs with RAG can improve the prediction of 30-day postoperative mortality and American Society of Anesthesiologists physical status classification using unstructured preoperative clinical notes.

Methods:

We conducted a retrospective cohort study using 24,491 medical records from a tertiary medical center, including preoperative anesthesia assessments, discharge summaries, and surgical information. To extract clinical insights from free-text data, we employed the LLaMA 3.1-8B language model with retrieval-augmented generation (RAG), using MedEmbed for text embedding and Miller’s Anesthesia as the primary retrieval source. We systematically evaluated model performance under various configurations (embedding models, chunk sizes, and few-shot prompting), using the weighted area under the precision-recall curve (AUPRC) for mortality prediction and the micro F1 score for American Society of Anesthesiologists (ASA) physical status classification.
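The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the bag-of-words `embed` function is a toy stand-in for an embedding model such as MedEmbed, the reference sentences are invented placeholders rather than text from Miller’s Anesthesia, and every function name and the prompt wording are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, chunk_size, overlap=0):
    """Split text into chunks of at most `chunk_size` words (assumed chunking scheme)."""
    words = text.split()
    step = max(1, chunk_size - overlap)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(w.strip(".,;:()").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(note, passages):
    """Assemble a hypothetical prompt that places retrieved passages before the note."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        f"Reference passages:\n{context}\n\n"
        f"Preoperative note:\n{note}\n\n"
        "Task: assign an ASA physical status class."
    )


# Invented placeholder passages, not quotations from any textbook.
reference_chunks = [
    "healthy patient with no systemic disease",
    "severe heart failure is a constant threat to life",
    "airway examination and fasting status",
]
note = "patient with severe heart failure"
prompt = build_prompt(note, retrieve(note, reference_chunks, k=1))
```

Chunk size and the choice of embedding model both change which passages are retrieved, which is presumably why they are among the configurations the study varies.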

Results:

The LLaMA-RAG model consistently outperformed traditional machine learning baselines. For 30-day postoperative mortality, it achieved the highest area under the receiver operating characteristic curve (AUROC) of 0.9570 (95% CI 0.9543–0.9597) and the highest AUPRC of 0.6536 (95% CI 0.6479–0.6593). For ASA classification, it attained the highest micro F1 score of 0.8409 (95% CI 0.8238–0.8551). Notably, the model demonstrated exceptional sensitivity in identifying rare but high-risk cases, such as ASA Class 5 patients and postoperative deaths.
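For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute them. It rests on several assumptions: AUPRC is approximated by unweighted average precision (the abstract does not define its weighting), tied scores are not handled specially in `average_precision`, and the percentile bootstrap is only one possible way to obtain a 95% CI, since the abstract does not state which method was used. All function names are hypothetical, and the toy labels are not study data.

```python
import random


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all classes."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean precision at each positive's rank; a step-wise approximation of AUPRC."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos if n_pos else 0.0


def bootstrap_ci(metric, y_true, y_out, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not necessarily the authors')."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_out[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Toy labels for illustration only.
asa_true, asa_pred = [1, 2, 3, 3], [1, 2, 3, 2]
died, risk = [0, 1, 0, 1], [0.1, 0.9, 0.8, 0.7]
f1_low, f1_high = bootstrap_ci(micro_f1, asa_true, asa_pred)
```

When each patient receives exactly one ASA class, micro F1 equals overall accuracy. For a rare outcome such as 30-day mortality, AUPRC is usually far lower than AUROC for the same model, so the two figures are best read together.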

Conclusions:

The LLaMA-RAG model significantly improved prediction of postoperative mortality and ASA classification, especially for rare high-risk cases. By grounding outputs in domain knowledge, retrieval-augmented generation enhanced both accuracy and interpretability.

Citation

Please cite as:

Chen YH, Ruan SJ, Chen PF

Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study

J Med Internet Res 2025;27:e75052

DOI: 10.2196/75052

PMID: 40460423

PMCID: 12174870

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 27, 2025

Open Peer Review Period: Mar 27, 2025 - Apr 11, 2025

Date Accepted: May 12, 2025
