Currently accepted at: JMIR Formative Research
Date Submitted: Nov 28, 2025
Open Peer Review Period: Dec 1, 2025 - Jan 26, 2026
Date Accepted: Feb 23, 2026
This paper has been accepted and is currently in production. It will appear shortly at DOI 10.2196/88618. The version below is the final accepted manuscript (not yet copyedited).
The Power of Multimodality: Comparative Analysis of Multimodal Large Language Models, Unimodal ChatGPT-5.0, and Human Clinical Experts on Wound Care Certification Examination
ABSTRACT
Background:
Multimodal large language models (MLLMs) capable of integrating visual and textual information represent a promising advancement for clinical applications requiring image interpretation. Wound care assessment, which demands simultaneous analysis of wound photographs and clinical data, provides an ideal domain to evaluate multimodal versus unimodal artificial intelligence capabilities against human expertise.
Objective:
To compare the performance of MLLMs, unimodal ChatGPT-5.0, and human clinical experts on a standardized wound care certification examination.
Methods:
This cross-sectional comparative study evaluated three participant groups on a 25-question wound care certification examination spanning four clinical domains (Diagnosis, Treatment, Complication Management, Wound Subtype Knowledge). Participants included three MLLMs (Med-PaLM 2, LLaVA-Med, BioGPT), one unimodal LLM (ChatGPT-5.0), and four human clinical experts (General Surgeon, Wound Care Nurse, two Internal Medicine Physicians). Statistical analyses included one-way ANOVA with Tukey's post-hoc tests and domain-specific Kruskal-Wallis comparisons.
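The group comparison described above can be sketched with SciPy. The per-participant scores below are illustrative values chosen to be consistent with the reported group means, not the study's actual data, and the group-to-participant assignments are assumptions.

```python
from scipy import stats

# Hypothetical per-participant accuracy scores (percent correct on the
# 25-question exam); illustrative only, chosen to roughly match the
# reported group means, not the study's raw data.
mllm_scores = [92, 76, 68]        # Med-PaLM 2, LLaVA-Med, BioGPT (assumed)
chatgpt_scores = [64]             # unimodal ChatGPT-5.0
human_scores = [96, 92, 80, 76]   # surgeon, nurse, two internists (assumed)

# One-way ANOVA across the three groups, as described in Methods.
# With k = 3 groups and N = 8 participants, df = (k - 1, N - k) = (2, 5).
f_stat, p_value = stats.f_oneway(mllm_scores, chatgpt_scores, human_scores)
print(f"F(2,5) = {f_stat:.2f}, p = {p_value:.3f}")
```

A full analysis would follow a significant omnibus test with pairwise post-hoc comparisons (e.g., `scipy.stats.tukey_hsd`) and per-domain nonparametric tests (`scipy.stats.kruskal`), as the Methods describe.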
Results:
Human experts achieved the highest accuracy (86.0%±9.1%), followed by MLLMs (78.7%±12.2%), while ChatGPT-5.0 achieved 64.0%, failing the 70% certification threshold. Significant overall group differences were observed (F(2,5)=8.42, p=0.018, η²=0.74). MLLMs significantly outperformed ChatGPT-5.0 (difference=14.7 percentage points, p=0.032, Cohen's d=1.38), with the multimodal advantage most pronounced in visually-dependent domains: Diagnosis (81% vs 43%, p=0.008) and Complication Management (72% vs 50%, p=0.034). No multimodal advantage was observed for text-based Wound Subtype Knowledge (both 67%). Med-PaLM 2 achieved 92% accuracy, matching the Wound Care Nurse, while the General Surgeon achieved the highest overall performance (96%).
Conclusions:
MLLMs demonstrate significant performance advantages over unimodal AI in wound care assessment, particularly for visually-dependent clinical tasks. While human experts with specialized wound care experience maintain overall superiority, top-performing MLLMs approach expert-level accuracy, supporting their potential role as clinical decision-support tools.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.