JMIR Preprints #64272: ChatGPT and The Suspicion of Skin Cancer, a Diagnostic Accuracy Study


ChatGPT and The Suspicion of Skin Cancer, a Diagnostic Accuracy Study

William Abou Shahla;
Firas Haddad;
Mariana El Hawa;
Dana Saade

ABSTRACT

Background:

While ChatGPT is user-friendly and widely accessible, concerns arise regarding potential delays in diagnosis and false reassurances for patients with suspected skin malignancies.

Objective:

Our study aims to assess the accuracy of AI, specifically ChatGPT, in diagnosing skin malignancies and in conveying the urgency of seeking medical advice.

Methods:

This diagnostic accuracy study assesses the agreement between dermatologists' final diagnoses and those provided by ChatGPT when patients describe their lesions. Thirty-five patients with suspected skin cancer (SCC/BCC) provided demographic details and lesion descriptions. The diagnoses returned by ChatGPT3.5 and ChatGPT4.0 were recorded for analysis.

Results:

All 35 lesions that the dermatologist suspected of malignancy were malignant, corresponding to 100% accuracy. ChatGPT3.5 flagged malignancy in 7 cases (20%), while ChatGPT4.0 did so in 6 cases (17.14%). Consistency was lacking: only 7 lesions received the same diagnosis from both models. However, both ChatGPT3.5 and ChatGPT4.0 referred patients to a dermatologist in all cases.
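
The percentages reported above follow directly from the counts over 35 lesions. As a minimal sketch for checking that arithmetic (the helper name `percent` and the dictionary labels are illustrative and are not taken from the study), the figures can be reproduced as follows:

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 2)


# Counts are taken from the Results paragraph; labels are illustrative only.
TOTAL_LESIONS = 35  # all 35 lesions were malignant per the dermatologists

rates = {
    "ChatGPT3.5 flagged malignancy": percent(7, TOTAL_LESIONS),
    "ChatGPT4.0 flagged malignancy": percent(6, TOTAL_LESIONS),
    "Same diagnosis from both models": percent(7, TOTAL_LESIONS),
    "Referred to a dermatologist (each model)": percent(35, TOTAL_LESIONS),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

Because every lesion was malignant, the share of cases in which a model flagged malignancy is the same as that model's sensitivity in this sample.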

Conclusions:

Both GPT models performed comparably to each other but were significantly inferior to dermatologists. However, neither caused delays in referral to a dermatologist. The limitations of these two models include poor accuracy, lack of concordance with each other, and poor reproducibility of their answers.

Citation

Please cite as:

Abou Shahla W, Haddad F, El Hawa M, Saade D

ChatGPT and The Suspicion of Skin Cancer, a Diagnostic Accuracy Study

JMIR Preprints. 13/07/2024:64272

DOI: 10.2196/preprints.64272

URL: https://preprints.jmir.org/preprint/64272


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Previously submitted to: JMIR Dermatology (no longer under consideration since Aug 06, 2024)

Date Submitted: Jul 13, 2024

Open Peer Review Period: Aug 6, 2024 - Aug 6, 2024


NOTE: This is an unreviewed Preprint
