Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 4, 2024
Open Peer Review Period: Mar 4, 2024 - Apr 29, 2024
Date Accepted: Jul 15, 2024
Date Submitted to PubMed: Jul 24, 2024

The final, peer-reviewed published version of this preprint can be found here:

Enhancement of the Performance of Large Language Models in Diabetes Education through Retrieval-Augmented Generation: Comparative Study

Wang D, Liang J, Ye J, Li J, Li J, Zhang Q, Hu Q, Pan C, Wang D, Liu Z, Shi W, Shi D, Li F, Qu B, Zheng Y

Enhancement of the Performance of Large Language Models in Diabetes Education through Retrieval-Augmented Generation: Comparative Study

J Med Internet Res 2024;26:e58041

DOI: 10.2196/58041

PMID: 39046096

PMCID: 11584532

Enhancement of Large Language Models' Performance in Diabetes Education: Retrieval-Augmented Generation Approach

  • Dingqiao Wang
  • Jiangbo Liang
  • Jinguo Ye
  • Jingni Li
  • Jingpeng Li
  • Qikai Zhang
  • Qiuling Hu
  • Caineng Pan
  • Dongliang Wang
  • Zhong Liu
  • Wen Shi
  • Danli Shi
  • Fei Li
  • Bo Qu
  • Yingfeng Zheng

ABSTRACT

Background:

Large language models (LLMs) have demonstrated advanced performance in processing clinical information. However, commercially available LLMs lack specialized medical knowledge and remain susceptible to generating inaccurate information. Because diabetes requires ongoing self-management, patients commonly seek information online. We introduce the RISE framework and evaluate how well it enhances LLMs in providing accurate responses to diabetes-related inquiries.

Objective:

This study aimed to evaluate the potential of the RISE framework, an information retrieval and augmentation tool, to improve LLMs' performance in accurately responding to diabetes-related inquiries.

Methods:

RISE, a retrieval augmentation framework, comprises four steps: Rewriting Query, Information Retrieval, Summarization, and Execution. Using a set of 43 common diabetes-related questions, we evaluated three base LLMs (GPT-4, Anthropic Claude 2, Google Bard) and their RISE-enhanced versions. Responses were assessed by clinicians for accuracy and comprehensiveness, and by patients for understandability.
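
The four RISE steps named above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions: the abstract gives the step names but not their implementation, so the function names, prompt wording, toy corpus, word-overlap retrieval, and the `echo_llm` stand-in are all hypothetical and are not the authors' code.

```python
from typing import Callable, List

# Hypothetical toy corpus; the abstract does not describe the real knowledge source.
CORPUS = [
    "Metformin is commonly used as a first-line medication for type 2 diabetes.",
    "Regular physical activity can help improve blood glucose control.",
    "Diabetic retinopathy screening relies on regular eye examinations.",
]

LLM = Callable[[str], str]  # any function mapping a prompt to a text reply


def rewrite_query(question: str, llm: LLM) -> str:
    """Step 1 (Rewriting Query): ask the model for a search-friendly query."""
    return llm(f"Rewrite as a search query: {question}")


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Step 2 (Information Retrieval): rank passages by shared words (toy scoring)."""
    terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(terms & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def summarize(passages: List[str], llm: LLM) -> str:
    """Step 3 (Summarization): condense the retrieved passages."""
    return llm("Summarize: " + " ".join(passages))


def execute(question: str, summary: str, llm: LLM) -> str:
    """Step 4 (Execution): answer the original question using the summary."""
    return llm(f"Context: {summary}\nQuestion: {question}\nAnswer:")


def rise_answer(question: str, llm: LLM, corpus: List[str] = CORPUS) -> str:
    """Chain the four steps in order."""
    query = rewrite_query(question, llm)
    passages = retrieve(query, corpus)
    summary = summarize(passages, llm)
    return execute(question, summary, llm)


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model: returns the prompt text after the first colon."""
    return prompt.split(":", 1)[1].strip()


if __name__ == "__main__":
    print(rise_answer("What medication is used for type 2 diabetes?", echo_llm))
```

Passing the model in as a plain callable keeps the sketch independent of any vendor API; a real deployment would replace `echo_llm` with a call to the chosen LLM and the toy corpus with a curated medical source.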

Results:

The integration of RISE significantly improved the accuracy and comprehensiveness of responses from all three base LLMs. On average, the percentage of accurate responses increased by 10.9% with RISE. The rates of accurate responses increased by 7.0% for GPT-4, 16.3% for Claude 2, and 9.3% for Google Bard. The framework also enhanced response comprehensiveness, with mean scores improving by 0.44. Understandability also improved, by 0.19 on average.
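
As a quick arithmetic check, the reported average gain is the unweighted mean of the three per-model gains. The short sketch below reproduces it. The second part rests on an assumption that the abstract does not state: if each gain is read as a share of the 43 questions, the figures would be consistent with 3, 7, and 4 additional accurate responses.

```python
# Per-model gains in the rate of accurate responses, as reported in the Results.
gains = {"GPT-4": 7.0, "Claude 2": 16.3, "Google Bard": 9.3}

# Unweighted mean across the three models.
mean_gain = round(sum(gains.values()) / len(gains), 1)
print(mean_gain)  # 10.9

# Assumption (not stated in the abstract): each gain is a share of the 43 questions.
N_QUESTIONS = 43
assumed_extra_correct = {"GPT-4": 3, "Claude 2": 7, "Google Bard": 4}
implied = {
    model: round(100 * n / N_QUESTIONS, 1)
    for model, n in assumed_extra_correct.items()
}
print(implied)  # {'GPT-4': 7.0, 'Claude 2': 16.3, 'Google Bard': 9.3}
```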

Conclusions:

RISE significantly improves LLMs' performance in diabetes-related inquiries, enhancing accuracy, comprehensiveness, and understandability. These improvements have important implications for RISE's future role in patient education and chronic illness self-management, which could help relieve pressure on medical resources and raise public awareness of medical knowledge.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.