Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Aug 25, 2025
Open Peer Review Period: Aug 25, 2025 - Oct 20, 2025
Date Accepted: Feb 24, 2026
Initial Insights into an Institutional Secure Large Language model for MRI Examination Requests: Retrospective Review
ABSTRACT
Background:
Incomplete clinical details on MRI examination requests (MERs) can lead to suboptimal protocol selection. An institutional secure large language model (sLLM) with access to the electronic medical record (EMR) may improve request completeness and protocol accuracy across multiple MRI subspecialties.
Objective:
To compare clinician MERs with sLLM-augmented MERs for information quality and to evaluate protocoling accuracy of the sLLM versus board-certified radiologists across body, musculoskeletal, and neuroradiology MRI.
Methods:
This retrospective study included 608 consecutive MRI examinations performed between September 2023 and July 2024 (body, 206; musculoskeletal, 203; neuroradiology, 199). The cohort comprised 528 patients (mean age 51.2 years ± 19.2 [SD], range 4–93; 279 women [52.8%], 249 men [47.2%]). MERs without EMR access were excluded. A privately hosted Anthropic Claude 3.5 model (temperature 0) augmented each MER with salient EMR data and, via rule-based parsing, recommended region/coverage and contrast use. Two experienced radiologists established a consensus reference standard, against which two board-certified general radiologists (Rad 3, Rad 4) and the sLLM were compared. Clinical-information quality was graded using the Reason-for-Exam Imaging Reporting and Data System (RI-RADS). Inter-rater reliability was quantified with Gwet's AC1, and paired accuracies were compared with McNemar tests.
Results:
Inter-reader agreement for RI-RADS was almost perfect for sLLM-augmented MERs (AC1 0.97; 95% CI 0.94–0.99) and moderate for clinician MERs (AC1 0.43; 95% CI 0.34–0.52). Limited or deficient clinical information (RI-RADS C/D) fell to 0–0.7% with sLLM augmentation versus 5.2–20.4% for clinician MERs. Overall protocol accuracy was 566/608 (93.1%; 95% CI 89.6–96.6) for the sLLM, 556/608 (91.4%; 95% CI 87.6–95.3) for Rad 3, and 560/608 (92.1%; 95% CI 88.4–95.8) for Rad 4 (sLLM vs Rad 3, p=.23; vs Rad 4, p=.40). Region/coverage accuracy was similar (sLLM 95.2%, Rad 3 96.2%, Rad 4 94.2%; p=.46 and p=.36). Contrast decisions were more accurate with the sLLM, at 574/608 (94.4%; 95% CI 91.3–97.5), than with Rad 3, at 560/608 (92.1%; 95% CI 88.4–95.8; p=.027), and not significantly different from Rad 4, at 565/608 (92.9%; 95% CI 89.4–96.4; p=.16). Subspecialty analyses showed similar patterns, with the sLLM outperforming Rad 4 for musculoskeletal MRI contrast decisions (96.6% vs 91.1%; p=.006) and matching readers elsewhere. Manual review indicated that sLLM improvements arose from EMR details not listed on the MER (infection/inflammation, tumor history, prior surgery). No clinically significant hallucinations were identified.
Conclusions:
Across body, musculoskeletal, and neuroradiology MRI, secure LLM-augmented examination requests had improved clinical context and enhanced contrast selection while matching general radiologists for region/coverage. Integrating secure LLMs into routine vetting workflows may reduce manual workload and standardize protocoling.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.