
Accepted for/Published in: JMIR Dermatology

Date Submitted: May 23, 2023
Date Accepted: Oct 30, 2023

The final, peer-reviewed published version of this preprint can be found here:

Evaluation of ChatGPT Dermatology Responses to Common Patient Queries

Ferreira AL, Chu B, Grant-Kels JM, Ogunleye T, Lipoff JB

JMIR Dermatol 2023;6:e49280

DOI: 10.2196/49280

PMID: 37976093

PMCID: 10692871

Evaluation of ChatGPT Dermatology Responses to Common Patient Queries

  • Alana L Ferreira; 
  • Brian Chu; 
  • Jane M Grant-Kels; 
  • Temitayo Ogunleye; 
  • Jules B Lipoff

ABSTRACT

Background:

The chat-based artificial intelligence (AI) service ChatGPT has gained over 100 million users, driven by its impressive responses to complex queries, and patients are likely already using it regularly.

Objective:

We aimed to assess the appropriateness of responses to common dermatologic patient questions generated by ChatGPT using GPT-4.

Methods:

Three experienced dermatologists designed 31 questions covering common skin conditions to test the appropriateness of ChatGPT's responses. The questions spanned seven categories: acne, atopic dermatitis, alopecia, psoriasis, rosacea, skin cancer, and miscellaneous. ChatGPT Plus was used to access GPT-4. In April 2023, we queried ChatGPT with each question three times, yielding 93 responses; a new chat was initiated for each question to avoid bias from prior context. The same three dermatologists independently graded each response as "appropriate" or "inappropriate" based on their clinical expertise.

Results:

ChatGPT generated appropriate responses to 84.9% (79/93) of queries and inappropriate responses to 15.1% (14/93), with appropriateness determined by the majority opinion of the three dermatologist reviewers. Overall, 16.1% (5/31) of questions were flagged as inappropriate, meaning that at least two of the three dermatologists rated at least two of that question's three responses as inappropriate.
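The grading scheme described above (a per-response majority vote of the three reviewers, and a question flagged when at least two of its three responses are judged inappropriate) can be sketched as follows. This is a minimal illustration of one plausible reading of the abstract's aggregation rule, not the study's actual code or ratings; the example data are invented.

```python
def response_appropriate(ratings):
    """Majority vote of the three dermatologist ratings for one response.

    ratings: list of three booleans (True = rated appropriate).
    """
    return sum(ratings) >= 2


def question_flagged_inappropriate(response_votes):
    """A question is flagged when at least 2 of its 3 responses
    were judged inappropriate by the reviewer majority."""
    return sum(not vote for vote in response_votes) >= 2


# Illustrative (invented) data: one question, three responses,
# each rated by three reviewers.
ratings_per_response = [
    [True, True, False],   # majority appropriate
    [False, False, True],  # majority inappropriate
    [False, True, False],  # majority inappropriate
]

votes = [response_appropriate(r) for r in ratings_per_response]
print(votes)                                 # [True, False, False]
print(question_flagged_inappropriate(votes))  # True
```

Under this reading, a question can contain individually inappropriate responses without being flagged overall, which is consistent with the response-level rate (15.1%) being lower than the question-level rate (16.1%) would be if every inappropriate response condemned its question.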

Conclusions:

Our results highlight that ChatGPT should not replace professional medical advice and should remain a supplementary informational tool for now. As AI advances, dermatologists must engage in developing clinical and patient-facing AI tools, considering public health and patient safety implications. Dermatologists should expect patients to use ChatGPT for their skin-related questions and be familiar with the types of responses generated. Clinical Trial: N/A



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.