Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Mar 27, 2023
Date Accepted: Aug 17, 2023
The potential of ChatGPT as a self-diagnostic tool in common orthopedic diseases: An exploratory study
ABSTRACT
Background:
Artificial intelligence (AI) has gained tremendous popularity recently, especially the use of natural language processing (NLP). ChatGPT is a state-of-the-art chatbot capable of creating natural conversations using NLP. The use of AI in medicine can have a tremendous impact on healthcare delivery. To date, no study has evaluated the accuracy and precision of ChatGPT's ability to support "self-diagnosis".
Objective:
To evaluate the accuracy and precision of ChatGPT's ability to "self-diagnose" common orthopedic diseases.
Methods:
Over a 5-day period, the study authors submitted the same questions to ChatGPT. The conditions evaluated were carpal tunnel syndrome (CTS), cervical myelopathy (CM), lumbar spinal stenosis (LSS), knee osteoarthritis (KOA), and hip osteoarthritis (HOA). Answers were categorized as "correct", "incorrect", or a "differential diagnosis", and the accuracy, precision, and percentage of correct answers were calculated. Answers were further subcategorized by each disease's name and as a "differential diagnosis". Intra- and inter-examiner variability was calculated using the Fleiss kappa coefficient. Answers recommending that the patient seek medical attention were recategorized according to the strength of the recommendation as defined by the study; because different phrases were used, the percentage of each phrase was obtained.
Results:
The percentage of correct answers was 100%, 4%, 96%, 64%, and 68% for CTS, CM, LSS, KOA, and HOA, respectively. The percentage of incorrect answers was 92% for CM and 0% for all other conditions. Intra-rater variability was 1.0, 0.15, 0.7, 0.6, and 0.6 for CTS, CM, LSS, KOA, and HOA, respectively; inter-rater variability was 1.0, 0.1, 0.64, -0.12, and 0.04, respectively. The phrases "essential", "recommended", "best", and "important" occurred in answers that recommended seeking medical attention: "essential" in 3.2% of answers, "recommended" in 9.6%, "best" in 6.4%, and "important" in 75.2%. Approximately 5.6% of the answers contained no recommendation to seek medical attention.
Conclusions:
The accuracy and precision of ChatGPT's "self-diagnosis" of 5 common orthopedic conditions were inconsistent. Accuracy could potentially be improved by adding symptoms that readily identify a specific anatomical location. Only a few answers (12.8%) contained a strong recommendation to seek medical attention by our study's standards. Although ChatGPT could serve as a potential first step in accessing care, we found variability in its ability to provide an accurate "self-diagnosis". Given the risk of harm from "self-diagnosis" without medical follow-up, it would be prudent for an NLP tool to include direct language alerting patients to seek an expert opinion. We hope to shed further light on the use of AI in a future clinical study.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.