JMIR Preprints #58041: Enhancement of Large Language Models' Performance in Diabetes Education: Retrieval-Augmented Generation Approach

Enhancement of Large Language Models' Performance in Diabetes Education: Retrieval-Augmented Generation Approach

Dingqiao Wang;
Jiangbo Liang;
Jinguo Ye;
Jingni Li;
Jingpeng Li;
Qikai Zhang;
Qiuling Hu;
Caineng Pan;
Dongliang Wang;
Zhong Liu;
Wen Shi;
Danli Shi;
Fei Li;
Bo Qu;
Yingfeng Zheng

ABSTRACT

Background:

Large language models (LLMs) have demonstrated advanced performance in processing clinical information. However, commercially available LLMs lack specialized medical knowledge and remain susceptible to generating inaccurate information. Because diabetes requires ongoing self-management, patients commonly seek information online. We introduce the RISE framework and evaluate how well it enhances LLMs' ability to provide accurate responses to diabetes-related inquiries.

Objective:

This study aimed to evaluate the potential of the RISE framework, an information retrieval and augmentation tool, to improve the accuracy of LLMs' responses to diabetes-related inquiries.

Methods:

RISE, a retrieval-augmentation framework, comprises four steps: Rewriting Query, Information Retrieval, Summarization, and Execution. Using a set of 43 common diabetes-related questions, we evaluated three base LLMs (GPT-4, Anthropic Claude 2, and Google Bard) and their RISE-enhanced versions. Clinicians assessed the responses for accuracy and comprehensiveness, and patients assessed them for understandability.
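The four steps named above can be sketched as a small pipeline. This is only an illustrative sketch, not the authors' implementation: the abstract does not describe the prompts, the retrieval source, or the ranking method, so the `LLM` callable, the `Passage` type, the prompt wording, and the keyword-overlap retrieval are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class Passage:
    source: str  # label for where the text came from (assumed field)
    text: str    # passage content


def rewrite_query(llm: LLM, question: str) -> str:
    """Step 1 (Rewriting Query): turn a patient question into a search query."""
    return llm(f"Rewrite this patient question as a concise search query:\n{question}")


def retrieve(query: str, corpus: List[Passage], k: int = 2) -> List[Passage]:
    """Step 2 (Information Retrieval): rank passages by shared words.

    Keyword overlap is a stand-in; the source does not say how retrieval works.
    """
    terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda p: len(terms & set(p.text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def summarize(llm: LLM, passages: List[Passage]) -> str:
    """Step 3 (Summarization): condense the retrieved passages."""
    joined = "\n".join(p.text for p in passages)
    return llm(f"Summarize the following passages:\n{joined}")


def execute(llm: LLM, question: str, summary: str) -> str:
    """Step 4 (Execution): answer the original question using the summary."""
    return llm(
        f"Using only this context:\n{summary}\n\nAnswer the question:\n{question}"
    )


def rise_answer(llm: LLM, question: str, corpus: List[Passage], k: int = 2) -> str:
    """Chain the four steps in order."""
    query = rewrite_query(llm, question)
    passages = retrieve(query, corpus, k)
    summary = summarize(llm, passages)
    return execute(llm, question, summary)
```

Passing the model as a plain callable keeps the sketch independent of any vendor API, so the same pipeline could wrap each of the three base models for a side-by-side comparison.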

Results:

The integration of RISE significantly improved the accuracy and comprehensiveness of responses from all three base LLMs. On average, the percentage of accurate responses increased by 10.9% with RISE; by model, the increase was 7.0% for GPT-4, 16.3% for Claude 2, and 9.3% for Google Bard. The framework also enhanced response comprehensiveness, with mean scores improving by 0.44. Understandability improved by 0.19 on average.
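The reported figures can be cross-checked with simple arithmetic. The sketch below confirms that the three per-model increases average to 10.9, and it infers how many of the 43 questions each increase would correspond to if the figures are absolute differences in the share of accurate responses. That interpretation, and the resulting question counts, are assumptions of this example; the abstract does not report counts.

```python
QUESTIONS = 43  # size of the question set, from the Methods section

# Per-model increases in accurate responses, as reported in the Results.
reported_gain = {"GPT-4": 7.0, "Claude 2": 16.3, "Google Bard": 9.3}

# Mean of the three reported increases.
mean_gain = sum(reported_gain.values()) / len(reported_gain)

# One question out of 43 is worth this many percentage points.
per_question = 100 / QUESTIONS


def implied_questions(gain_pct: float) -> int:
    """Nearest whole number of questions matching a gain (inferred, not reported)."""
    return round(gain_pct / per_question)


implied = {model: implied_questions(g) for model, g in reported_gain.items()}

print(f"mean gain: {mean_gain:.1f}")
for model, n in implied.items():
    print(f"{model}: about {n} of {QUESTIONS} questions")
```

Under this reading, each reported increase is consistent with a whole number of additional accurate answers out of 43.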

Conclusions:

RISE significantly improves LLMs' performance on diabetes-related inquiries, enhancing accuracy, comprehensiveness, and understandability. These improvements have important implications for RISE's future role in patient education and chronic illness self-management, which could help relieve pressure on medical resources and raise public awareness of medical knowledge.

Citation

Please cite as:

Wang D, Liang J, Ye J, Li J, Li J, Zhang Q, Hu Q, Pan C, Wang D, Liu Z, Shi W, Shi D, Li F, Qu B, Zheng Y

Enhancement of the Performance of Large Language Models in Diabetes Education through Retrieval-Augmented Generation: Comparative Study

J Med Internet Res 2024;26:e58041

DOI: 10.2196/58041

PMID: 39046096

PMCID: 11584532

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 4, 2024

Open Peer Review Period: Mar 4, 2024 - Apr 29, 2024

Date Accepted: Jul 15, 2024

Date Submitted to PubMed: Jul 24, 2024
