Currently submitted to: Journal of Medical Internet Research

Date Submitted: Mar 27, 2026
Open Peer Review Period: Apr 9, 2026 - Jun 4, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

A multi-agent LLM framework toward real-world clinical decision-making support in acute ischemic stroke

  • Yuehua Li; 
  • Bi-Cong Yan; 
  • Ruipeng Zhang; 
  • Xinyu Song; 
  • Li Chen; 
  • Zhongzheng Cao

ABSTRACT

Background:

Acute ischemic stroke (AIS) treatment selection requires rapid, guideline-concordant integration of clinical, imaging, and laboratory data, including therapeutic windows, contraindications, stroke severity, and imaging eligibility. This process is complex, expertise-dependent, and vulnerable to safety-critical errors.

Objective:

To develop and validate a structured multi-agent large language model (LLM) framework for real-world AIS decision support, and to assess whether it can improve guideline adherence, safety auditability, and physician decision-making, particularly among junior physicians and non-specialists.

Methods:

We developed a multi-agent LLM workflow that imposed structured outputs and guideline-based reasoning to generate treatment recommendations (intravenous thrombolysis, endovascular thrombectomy, standard medical therapy, or non-AIS/non-stroke) and TOAST subtypes. The framework was evaluated using multicenter retrospective real-world cases, prospectively collected clinical cases, and literature-derived challenging cases. Performance was assessed against clinical reference standards. Safety was assessed by omission and hallucination event rates and clinician-rated usefulness (5-point Likert scale). In a prospective physician study, paired physician-by-case decisions with and without LLM output were analyzed using a binomial generalized linear mixed-effects model with crossed random intercepts for physician and case.
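One plausible way to write the mixed-effects model described above is sketched below. The abstract names only the model family and the crossed random intercepts; the logit link, the single fixed effect for LLM support, and all symbols are assumptions introduced here for illustration, and the authors' actual specification may include further covariates (for example physician seniority or specialty).

```latex
% Y_{ij} = 1 if physician i's treatment decision on case j matched the reference standard.
% LLM_{ij} = 1 if the decision was made with LLM output available, 0 otherwise.
\begin{aligned}
\operatorname{logit}\,\Pr(Y_{ij} = 1) &= \beta_0 + \beta_1\,\mathrm{LLM}_{ij} + u_i + v_j,\\
u_i &\sim \mathcal{N}(0, \sigma_u^2) \quad \text{(random intercept for physician } i\text{)},\\
v_j &\sim \mathcal{N}(0, \sigma_v^2) \quad \text{(random intercept for case } j\text{)}.
\end{aligned}
```

Under this sketch, $\exp(\beta_1)$ would be the odds ratio for a correct decision with versus without LLM support, conditional on physician and case.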

Results:

Framework augmentation improved treatment recommendation accuracy across representative Baichuan, Qwen, DeepSeek, and GPT models. In Group A, accuracy increased from 0.546 to 0.687, 0.574 to 0.697, 0.687 to 0.847, and 0.737 to 0.851, respectively. Similar improvements were observed in Group B (0.595 to 0.684, 0.587 to 0.671, 0.671 to 0.813, and 0.698 to 0.798) and Group C (0.507 to 0.667, 0.618 to 0.674, 0.646 to 0.729, and 0.597 to 0.750). Compared with the standalone model, the augmented framework also showed higher safety scores (4.36 vs 4.02), lower hallucination rates (3.1% vs 4.7%), and lower omission rates (10.3% vs 16.6%). In the prospective physician study, treatment decision accuracy increased from 73.1% to 88.6% with LLM support, with greater gains among junior physicians and non-specialists.
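The reported accuracy pairs can be turned into absolute gains with a few lines of arithmetic. The sketch below uses only the figures quoted above; it assumes that the Group B and Group C values follow the same model order (Baichuan, Qwen, DeepSeek, GPT) as Group A, which the abstract implies but does not state. The names `MODELS`, `REPORTED`, and `absolute_gains` are illustrative.

```python
# Accuracy pairs (standalone, framework-augmented) as quoted in the abstract.
# Assumption: Groups B and C list models in the same order as Group A.
MODELS = ("Baichuan", "Qwen", "DeepSeek", "GPT")

REPORTED = {
    "A": [(0.546, 0.687), (0.574, 0.697), (0.687, 0.847), (0.737, 0.851)],
    "B": [(0.595, 0.684), (0.587, 0.671), (0.671, 0.813), (0.698, 0.798)],
    "C": [(0.507, 0.667), (0.618, 0.674), (0.646, 0.729), (0.597, 0.750)],
}


def absolute_gains(reported):
    """Return {group: {model: augmented - standalone}}, rounded to 3 decimals."""
    return {
        group: {
            model: round(after - before, 3)
            for model, (before, after) in zip(MODELS, pairs)
        }
        for group, pairs in reported.items()
    }


gains = absolute_gains(REPORTED)

# Physician study: accuracy rose from 73.1% to 88.6%, in percentage points.
physician_gain_pp = round(88.6 - 73.1, 1)

for group, by_model in gains.items():
    print(group, by_model)
print("physician study gain (percentage points):", physician_gain_pp)
```

On these figures every model improves in every group, with absolute gains ranging from 0.056 (Qwen, Group C) to 0.160 (DeepSeek in Group A and Baichuan in Group C).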

Conclusions:

A structured multi-agent framework improved LLM performance in AIS treatment recommendation and TOAST classification, while providing safer, more auditable decision support. It was also associated with higher physician decision accuracy, with larger gains among less-experienced physicians, suggesting potential to reduce expertise-related disparities in stroke care. Prospective multicenter studies are needed to assess effects on workflow and clinical outcomes.

Clinical Trial:

Chinese Clinical Trial Registry ChiCTR2400092800; registered November 22, 2024.


Citation

Please cite as:

Li Y, Yan BC, Zhang R, Song X, Chen L, Cao Z

A multi-agent LLM framework toward real-world clinical decision-making support in acute ischemic stroke

JMIR Preprints. 27/03/2026:96304

DOI: 10.2196/preprints.96304

URL: https://preprints.jmir.org/preprint/96304


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.