
Accepted for/Published in: JMIR Cardio

Date Submitted: Oct 10, 2023
Date Accepted: Feb 22, 2024

The final, peer-reviewed published version of this preprint can be found here:

A Multidisciplinary Assessment of ChatGPT’s Knowledge of Amyloidosis: Observational Study

King RC, Samaan JS, Yeo YH, Peng Y, Kunkel DC, Habib AA, Ghashghaei R


JMIR Cardio 2024;8:e53421

DOI: 10.2196/53421

PMID: 38640472

PMCID: 11069089

A Multidisciplinary Assessment of ChatGPT’s Knowledge of Amyloidosis: An Observational Study

  • Ryan C. King; 
  • Jamil S. Samaan; 
  • Yee Hui Yeo; 
  • Yuxin Peng; 
  • David C. Kunkel; 
  • Ali A. Habib; 
  • Roxana Ghashghaei

ABSTRACT

Background:

Amyloidosis, a rare multisystem condition, requires multidisciplinary care. Its low prevalence underscores the importance of patient education in improving outcomes. The large language model (LLM) ChatGPT offers a potential avenue for disseminating accurate, reliable, and accessible educational resources.

Objective:

We performed a multidisciplinary assessment of the accuracy and reproducibility of ChatGPT in answering questions related to amyloidosis.

Methods:

A total of 98 amyloidosis questions related to cardiology, gastroenterology, and neurology were curated from medical societies, institutions, and amyloidosis Facebook support groups and were input into GPT-3.5 and GPT-4. Cardiology- and gastroenterology-related responses were independently graded by a gastroenterologist and a cardiologist who specialize in amyloidosis, with disagreements resolved through discussion. Neurology-related responses were graded by a neurologist who specializes in amyloidosis. Reviewers used the following grading scale: (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, and (4) completely incorrect. Questions were stratified by category for further analysis. Reproducibility was assessed by inputting each question twice into each model.
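The grading and reproducibility tallies described above can be sketched as follows. This is a hypothetical illustration, not the authors' analysis code; the function names, the mapping of "accurate" to grades 1–2, and the definition of reproducibility as two runs landing on the same side of the accurate/inaccurate divide are all assumptions for the sake of the example.

```python
# Hypothetical sketch of tallying reviewer grades, assuming the scale:
# 1 = comprehensive, 2 = correct but inadequate,
# 3 = some correct and some incorrect, 4 = completely incorrect.
from collections import Counter


def summarize(grades: list[int]) -> dict[str, float]:
    """Proportion of accurate (grades 1-2, assumed) and comprehensive (grade 1) responses."""
    n = len(grades)
    counts = Counter(grades)
    return {
        "accurate": (counts[1] + counts[2]) / n,
        "comprehensive": counts[1] / n,
    }


def reproducibility(first_run: list[int], second_run: list[int]) -> float:
    """Share of questions whose two runs agree.

    Agreement is assumed to mean both runs fall on the same side of the
    accurate/inaccurate divide (grades 1-2 vs. grades 3-4).
    """
    agree = sum((a <= 2) == (b <= 2) for a, b in zip(first_run, second_run))
    return agree / len(first_run)
```

For example, `summarize([1, 1, 2, 3])` reports 75% accurate and 50% comprehensive, mirroring how the Results section expresses proportions such as 93/98 (94.9%).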

Results:

GPT-4 provided accurate information in 93/98 (94.9%) responses, 82/98 (83.7%) of which were comprehensive. GPT-3.5 provided accurate information in 74/83 (89.2%) responses, 66/83 (79.5%) of which were comprehensive. When examined by question category, GPT-4 and GPT-3.5 provided comprehensive responses to 53/56 (94.6%) and 48/56 (85.7%) of "general" questions, respectively. When examined by subject, GPT-4 and GPT-3.5 performed best on cardiology questions, with both models producing 10/12 (83.3%) comprehensive responses. For gastroenterology, GPT-4 received comprehensive grades for 9/15 (60.0%) responses, and GPT-3.5 for 8/15 (53.3%). Overall, 97/98 (99.0%) of GPT-4 responses and 78/83 (94.0%) of GPT-3.5 responses were reproducible.

Conclusions:

LLMs have potential as a supplemental tool for disseminating vital health education to patients living with amyloidosis. Before widespread implementation, the technology's limitations and ethical implications must be further explored.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.