JMIR Preprints #67489: Clinical Management of Wasp Stings Using Large Language Models: A Cross-sectional Evaluation Study

Xianyi Yang;
Wei Pan;
Shuman Zhang;
Yonghong Wang;
Zhenglin Quan;
Yanxia Zhu;
Zhicheng Fang

ABSTRACT

Background:

Wasp stings pose a significant global public health challenge, particularly in tropical and subtropical regions. With the rapid advancement of artificial intelligence, large language models (LLMs) are increasingly being applied in healthcare settings. This study systematically evaluates the performance of four LLMs in the clinical management of wasp sting incidents.

Objective:

This study aimed to assess the accuracy of four LLMs and their ability to handle different aspects of wasp sting management, from basic knowledge to complex decision-making, in order to inform improvements to AI models and clinical protocols.

Methods:

We used a cross-sectional design to evaluate four LLMs: ERNIE Bot 3.5, ERNIE Bot 4.0, Claude, and ChatGPT 4.0. Fifty standardized questions covering ten critical areas of wasp sting management were developed, together with 20 real-world case-based clinical scenarios. Eight experts in wasp sting management scored the models' responses for accuracy and completeness on a 5-point Likert scale. We used the Wilcoxon signed-rank test to compare scores between models and Kendall's coefficient of concordance to assess the consistency of the expert ratings.
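As an illustration of the two statistics named above, the following is a minimal sketch in plain Python, not the authors' analysis code. It assumes ratings are stored as one list per rater, omits the tie correction for Kendall's W, and uses the normal approximation without continuity or tie correction for the Wilcoxon signed-rank p value. The function names and all scores shown are hypothetical and are not data from the study.

```python
from math import erf, sqrt


def average_ranks(values):
    """Rank values in ascending order (1-based); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for ratings[rater][item]; no tie correction (assumption)."""
    m = len(ratings)        # number of raters
    n = len(ratings[0])     # number of rated items
    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(ranked[r][i] for r in range(m)) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def wilcoxon_signed_rank(x, y):
    """Return (W, two-sided p) for paired scores via the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd                      # z <= 0 because w is the smaller sum
    p = 2 * 0.5 * (1 + erf(z / sqrt(2)))     # two-sided p from the lower tail
    return w, min(p, 1.0)


# Hypothetical Likert scores for two models on the same six questions.
model_a = [5, 5, 4, 5, 4, 5]
model_b = [3, 4, 2, 4, 4, 2]
w_stat, p_value = wilcoxon_signed_rank(model_a, model_b)

# Hypothetical ratings from three raters on four responses.
agreement = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

With only a handful of pairs, as in this toy example, an exact test would be preferable to the normal approximation; an established statistics package is the better choice for a real analysis.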

Results:

Claude achieved the highest accuracy and completeness scores, significantly outperforming the other models. ChatGPT 4.0 ranked second, while ERNIE Bot 3.5 and ERNIE Bot 4.0 performed less well. Claude showed a clear advantage in complex decision-making categories such as complication management and severity assessment. In the 20 clinical scenarios, Claude also performed significantly better than ERNIE Bot 3.5 (p<0.001). Analysis of expert rating consistency revealed moderate agreement.

Conclusions:

Claude and ChatGPT 4.0 demonstrated significant potential in the clinical management of wasp stings, particularly in providing comprehensive information and assisting with complex decision-making. This study offers insights for optimizing AI models and improving wasp sting management protocols. We recommend selecting the AI model according to the specific application scenario.

Citation

Please cite as:

Yang X, Pan W, Zhang S, Wang Y, Quan Z, Zhu Y, Fang Z. Clinical Management of Wasp Stings Using Large Language Models: Cross-Sectional Evaluation Study. J Med Internet Res 2025;27:e67489

DOI: 10.2196/67489
PMID: 40466102
PMCID: 12177424

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 13, 2024

Date Accepted: Apr 29, 2025
