Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Performance of three conversational generative AI models for computing maximum safe doses of local anesthetics: a comparative analysis.
ABSTRACT
Background:
Generative artificial intelligence (AI) is showing great promise as a tool to optimize decision-making across various fields, including medicine. In anesthesiology, accurately calculating maximum safe doses of local anesthetics (LAs) is crucial to prevent complications such as local anesthetic systemic toxicity (LAST). Current methods for determining LA dosage are largely based on empirical guidelines and clinician experience, which can result in significant variability and dosing errors. Generative AI models may offer a solution, as they could be capable of integrating all the relevant parameters and suggesting adequate LA doses.
Objective:
This study aimed to evaluate the efficacy and safety of 3 generative AI models—ChatGPT, Copilot, and Gemini—in calculating maximum safe LA doses, with the goal of determining their potential utility in clinical practice.
Methods:
A comparative analysis was conducted using a 51-question questionnaire designed to assess LA dose calculation across 10 simulated clinical vignettes. The responses generated by ChatGPT, Copilot, and Gemini were compared to reference doses calculated using a scientifically validated set of rules. Quantitative evaluations involved comparing AI-generated doses to these reference doses, while qualitative assessments were conducted by independent reviewers using a 5-point Likert scale.
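The reference calculation described above is, at its core, weight-based arithmetic. As a hedged illustration only, the sketch below shows how such a reference dose might be computed; the mg/kg ceilings are commonly cited textbook values used here as assumptions, not the validated rule set applied in the study, and nothing below should guide clinical practice.

```python
# Illustrative sketch of a weight-based maximum LA dose calculation.
# The mg/kg ceilings are assumed textbook values for illustration only,
# NOT the validated reference rules used in the study.
MAX_MG_PER_KG = {
    ("lidocaine", False): 4.5,   # plain solution
    ("lidocaine", True): 7.0,    # with epinephrine
    ("bupivacaine", False): 2.0, # plain solution
}

def max_safe_dose_mg(agent: str, weight_kg: float, with_epi: bool = False) -> float:
    """Weight-based maximum dose in mg for a single agent."""
    return MAX_MG_PER_KG[(agent, with_epi)] * weight_kg

def mg_to_ml(dose_mg: float, concentration_pct: float) -> float:
    """A C% solution contains 10*C mg of drug per mL."""
    return dose_mg / (10.0 * concentration_pct)

# Example: 70 kg patient, 1% plain lidocaine
dose_mg = max_safe_dose_mg("lidocaine", 70)   # 315.0 mg
volume_ml = mg_to_ml(dose_mg, 1.0)            # 31.5 mL
```

A complete rule set would also cap absolute doses and apportion the allowance across agents when mixtures are used, which is where the abstract reports the models performed worst.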
Results:
All 3 AI models—Gemini, ChatGPT, and Copilot—completed the questionnaire and demonstrated a basic understanding of LA dose calculation principles, but their performance in providing safe doses varied significantly. Gemini frequently avoided proposing any specific dose, instead recommending consultation with a specialist. When it did provide dose ranges, they often exceeded safe limits by 140±103% in cases involving mixtures. ChatGPT provided unsafe doses in 90% of cases, exceeding safe limits by 198±196%. Copilot's recommendations were unsafe in 67% of cases, exceeding limits by 217±239%. Qualitative assessments rated Gemini as "fair" and both ChatGPT and Copilot as "poor".
Conclusions:
Generative AI models like Gemini, ChatGPT, and Copilot currently lack the accuracy and reliability needed for safe LA dose calculation. Their poor performance suggests that they should not be used as decision-making tools for this purpose. Until more reliable AI-driven solutions are developed and validated, clinicians should rely on their expertise, experience, and a careful assessment of individual patient factors to guide LA dosing and ensure patient safety. Clinical Trial: N/A
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.