Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 27, 2023
Open Peer Review Period: Jan 8, 2024 - Mar 4, 2024
Date Accepted: Mar 19, 2024

The final, peer-reviewed published version of this preprint can be found here:

Leveraging Large Language Models for Improved Patient Access and Self-Management: Assessor-Blinded Comparison Between Expert- and AI-Generated Content

Lv X, Zhang X, Li Y, Ding X, Lai H, Shi J

J Med Internet Res 2024;26:e55847

DOI: 10.2196/55847

PMID: 38663010

PMCID: 11082737

Leveraging Large Language Models for Improved Patient Access and Self-Management in Oral Healthcare: An Assessor-blinded Preclinical Study

  • Xiaolei Lv
  • Xiaomeng Zhang
  • Yuan Li
  • Xinxin Ding
  • Hongchang Lai
  • Junyu Shi

ABSTRACT

Background:

While Large Language Models like ChatGPT and Google Bard have shown significant promise in various fields, their broader impact on enhancing patient healthcare access and quality, particularly in specialized domains like oral health, requires comprehensive evaluation.

Objective:

This study aims to assess the effectiveness of Google Bard, ChatGPT-3.5, and ChatGPT-4 in offering recommendations for common oral health issues, benchmarked against responses from human dental experts.

Methods:

This comparative analysis used 40 questions derived from patient surveys on prevalent oral diseases, posed in a simulated clinical environment. Responses were sourced from both human experts and Large Language Models and were rated on readability, appropriateness, harmlessness, comprehensiveness, intent capture, and helpfulness by experienced dentists and lay users, respectively. The stability of AI responses was also assessed by submitting each question three times under consistent conditions.

Results:

Google Bard exhibited the best readability among all groups but scored significantly lower in appropriateness than human experts (8.51 ± 0.37 vs 9.60 ± 0.33, P = .034), while ChatGPT-3.5 and ChatGPT-4 performed comparably with human experts in appropriateness (8.96 ± 0.35 and 9.34 ± 0.47, respectively). All three Large Language Models received high harmlessness scores, comparable to those of human experts. Lay users found no significant difference in helpfulness and intent capture between Large Language Models and human experts. Stability evaluation revealed ChatGPT-4 as the most reliable, with the highest number of correct responses and the fewest incorrect and unreliable responses.

Conclusions:

Large Language Models, particularly ChatGPT-4, show potential in oral healthcare, providing patient-centric information for enhancing patient education and clinical care. The observed performance variations underscore the need for ongoing refinement and ethical considerations in healthcare settings. Future research should focus on developing strategies for the safe integration of Large Language Models in healthcare settings.

Clinical Trial:

NA


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.