Accepted for/Published in: JMIR Formative Research
Date Submitted: Jan 16, 2025
Date Accepted: Mar 11, 2025
Assessment of recommendations provided to athletes regarding sleep education by GPT-4o and Google Gemini: A comparative evaluation study
ABSTRACT
Background:
Inadequate sleep is prevalent among athletes, affecting adaptation to training and performance. Education on factors influencing sleep can improve sleep behaviors, and Large Language Models (LLMs) may offer a scalable approach to providing such sleep education to athletes.
Objective:
This study aims to i) investigate the quality of sleep recommendations generated by publicly available LLMs, as evaluated by experienced raters, and ii) determine whether recommendation quality varies with information input granularity.
Methods:
Two prompts with differing information input granularity (low and high) were created for each of two use cases and entered into ChatGPT-4o (GPT-4o) and Google Gemini, resulting in n=8 different recommendations. Experienced raters (n=13) evaluated the recommendations on a 1-5 Likert scale, based on n=10 sleep criteria derived from recent literature.
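The count of n=8 recommendations follows from crossing the three factors named above (2 LLMs x 2 granularity levels x 2 use cases). A minimal sketch of that design in Python is given below. The use-case labels are placeholders, since the abstract does not name them, and the total number of individual scores assumes that every rater scored every recommendation on every criterion, which the abstract does not state explicitly.

```python
from itertools import product

# Factor levels taken from the Methods paragraph.
llms = ["GPT-4o", "Google Gemini"]
granularities = ["low", "high"]
# Placeholder labels: the abstract does not name the two use cases.
use_cases = ["use case A", "use case B"]

# Every combination of LLM, input granularity, and use case yields one recommendation.
conditions = list(product(llms, granularities, use_cases))

n_raters = 13    # experienced raters
n_criteria = 10  # sleep criteria derived from the literature

# Assumption: a fully crossed rating scheme (each rater scores each
# recommendation on each criterion).
n_scores = len(conditions) * n_raters * n_criteria

for llm, granularity, use_case in conditions:
    print(f"{llm:14s} | {granularity:4s} granularity | {use_case}")
print("recommendations:", len(conditions))
print("individual criterion scores (if fully crossed):", n_scores)
```

Under these assumptions the sketch lists eight conditions, matching the n=8 recommendations reported in the Methods.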
Results:
The highest summary rating was achieved by GPT-4o using high input information granularity, with n=8 ratings >3 (tendency towards good), n=3 ratings equal to 3 (neutral), and n=2 ratings <3 (tendency towards bad). GPT-4o significantly outperformed Google Gemini in 9 out of 10 criteria (P<.001 to P=.045). Recommendations generated with high input granularity received significantly higher ratings than those with low granularity across both LLMs and use cases (P<.001 to P=.049).
Conclusions:
Both LLMs exhibit limitations, neglecting vital criteria of sleep education. Sleep recommendations by GPT-4o and Google Gemini were evaluated as suboptimal, with GPT-4o achieving higher overall quality. However, both LLMs demonstrated improved recommendation quality with higher information input granularity, emphasizing the need for specific prompts and a thorough review of outputs to safely integrate AI technologies into sleep education.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.