
Accepted for/Published in: JMIR Dermatology

Date Submitted: May 23, 2023
Date Accepted: Oct 30, 2023

The final, peer-reviewed published version of this preprint can be found here:

Evaluation of ChatGPT Dermatology Responses to Common Patient Queries

Ferreira AL, Chu B, Grant-Kels JM, Ogunleye T, Lipoff JB

JMIR Dermatol 2023;6:e49280

DOI: 10.2196/49280

PMID: 37976093

PMCID: 10692871

Evaluation of ChatGPT Dermatology Responses to Common Patient Queries

  • Alana L Ferreira; 
  • Brian Chu; 
  • Jane M Grant-Kels; 
  • Temitayo Ogunleye; 
  • Jules B Lipoff

ABSTRACT

Background:

The chat-based artificial intelligence (AI) service ChatGPT has gained over 100 million users, driven by its impressive responses to complex queries, and patients are likely already using it regularly.

Objective:

We aimed to assess the appropriateness of responses to common dermatologic patient questions generated by ChatGPT using GPT-4.

Methods:

Three experienced dermatologists designed 31 questions covering common skin conditions to test the appropriateness of ChatGPT's responses. The questions spanned seven categories: acne, atopic dermatitis, alopecia, psoriasis, rosacea, skin cancer, and miscellaneous. ChatGPT Plus was used to access GPT-4. In April 2023, we queried ChatGPT with each question three times, yielding 93 responses; a new chat was initiated for each question to avoid bias from prior context. The same three dermatologists independently graded each response as "appropriate" or "inappropriate" based on their clinical expertise.

Results:

ChatGPT generated appropriate responses to 84.9% (79/93) of queries and inappropriate responses to 15.1% (14/93), with appropriateness determined by the majority opinion of the three dermatologist reviewers. Overall, 16.1% (5/31) of questions were flagged as inappropriate, meaning that at least two of the three dermatologists rated at least two of that question's three responses as inappropriate.
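The grading scheme described above (a per-response majority vote of the three reviewers, and a question flagged when at least two of its three responses are judged inappropriate) can be sketched as follows. This is a minimal illustration of one plausible reading of the abstract's aggregation rule, not the study's actual code or ratings; the example data are invented.

```python
def response_appropriate(ratings):
    """Majority vote of the three dermatologist ratings for one response.

    ratings: list of three booleans (True = rated appropriate).
    """
    return sum(ratings) >= 2


def question_flagged_inappropriate(response_votes):
    """A question is flagged when at least 2 of its 3 responses
    were judged inappropriate by the reviewer majority."""
    return sum(not vote for vote in response_votes) >= 2


# Illustrative (invented) data: one question, three responses,
# each rated by three reviewers.
ratings_per_response = [
    [True, True, False],   # majority appropriate
    [False, False, True],  # majority inappropriate
    [False, True, False],  # majority inappropriate
]

votes = [response_appropriate(r) for r in ratings_per_response]
print(votes)                                 # [True, False, False]
print(question_flagged_inappropriate(votes))  # True
```

Under this reading, a question can contain individually inappropriate responses without being flagged overall, which is consistent with the response-level rate (15.1%) being lower than the question-level rate (16.1%) would be if every inappropriate response condemned its question.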

Conclusions:

Our results highlight that ChatGPT should not replace professional medical advice and should remain a supplementary informational tool for now. As AI advances, dermatologists must engage in developing clinical and patient-facing AI tools, considering public health and patient safety implications. Dermatologists should expect patients to use ChatGPT for their skin-related questions and be familiar with the types of responses generated. Clinical Trial: N/A



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.