Accepted for/Published in: JMIR Medical Education

Date Submitted: Jul 27, 2023
Date Accepted: Dec 11, 2023

The final, peer-reviewed published version of this preprint can be found here:

Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study

Zaleski A, Berkowsky R, Craig KJT, Pescatello L

Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study

JMIR Med Educ 2024;10:e51308

DOI: 10.2196/51308

PMID: 38206661

PMCID: 10811574

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an Artificial Intelligence-Based Chatbot

  • Amanda Zaleski
  • Rachel Berkowsky
  • Kelly Jean Thomas Craig
  • Linda Pescatello

ABSTRACT

Background:

Regular physical activity is critical for health and disease prevention, yet healthcare providers and patients face barriers to implementing evidence-based lifestyle recommendations. The growing availability of artificial intelligence (AI) technologies offers substantial potential to augment care; however, the suitability of AI-generated exercise recommendations has yet to be explored.

Objective:

The purpose of this study was to assess the comprehensiveness, accuracy, and readability of individualized exercise recommendations generated by a novel AI chatbot.

Methods:

A coding scheme was developed to score AI-generated exercise recommendations across ten categories informed by gold-standard exercise recommendations: (1) health condition-specific benefits of exercise, (2) exercise pre-participation health screening, (3) frequency, (4) intensity, (5) time, (6) type, (7) volume, (8) progression, (9) special considerations, and (10) references to primary literature. The AI chatbot was prompted to provide individualized exercise recommendations for 26 clinical populations using an open-source application programming interface. Two independent reviewers coded the AI-generated content for each category and calculated comprehensiveness and factual accuracy, each expressed as a percentage (0%-100%). Readability was assessed using the Flesch-Kincaid formula. Qualitative analysis identified and categorized themes from the AI-generated output.
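The abstract does not state the exact formulas behind the two percentages, so the following is only a minimal sketch of one plausible reading: comprehensiveness as the share of the ten categories the chatbot addressed, and accuracy as the share of addressed categories that a reviewer judged correct. The category labels, the code values ("accurate", "inaccurate", "missing"), and the function name `score_recommendation` are hypothetical and chosen for illustration.

```python
from typing import Dict, Optional, Tuple

# The ten coding categories listed in the Methods (labels shortened here).
CATEGORIES = [
    "condition-specific benefits",
    "pre-participation screening",
    "frequency",
    "intensity",
    "time",
    "type",
    "volume",
    "progression",
    "special considerations",
    "references",
]


def score_recommendation(codes: Dict[str, str]) -> Tuple[float, Optional[float]]:
    """Return (comprehensiveness %, accuracy %) for one coded chatbot response.

    `codes` maps a category to "accurate", "inaccurate", or "missing".
    Categories absent from `codes` are treated as "missing".
    ASSUMPTION: accuracy is computed only over categories that were addressed;
    the study's actual denominator is not given in the abstract.
    """
    present = [c for c in CATEGORIES if codes.get(c, "missing") != "missing"]
    accurate = [c for c in present if codes[c] == "accurate"]
    comprehensiveness = 100.0 * len(present) / len(CATEGORIES)
    # Accuracy is undefined when nothing was addressed.
    accuracy = 100.0 * len(accurate) / len(present) if present else None
    return comprehensiveness, accuracy


# Hypothetical coded response: four categories addressed, one of them wrong.
example_codes = {
    "frequency": "accurate",
    "intensity": "accurate",
    "type": "accurate",
    "pre-participation screening": "inaccurate",
}
example_scores = score_recommendation(example_codes)
```

With this hypothetical input, four of ten categories are addressed and three of those four are correct. Averaging such per-response scores across the 26 clinical populations would give study-level figures, again under the assumptions stated above.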

Results:

AI-generated exercise recommendations were 41% comprehensive and 91% accurate, with the majority (53%) of inaccuracies relating to the need for exercise pre-participation medical clearance. The average readability of AI-generated exercise recommendations was at the college level, with an average Flesch reading ease score of 31.1. Recurring themes and observations in the AI-generated output included concern for liability and safety; preference for aerobic exercise; and potential bias and direct discrimination against certain age-based populations and individuals with disabilities.
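For readers unfamiliar with the readability metrics, the standard Flesch reading ease and Flesch-Kincaid grade level formulas can be sketched as below. The functions take pre-computed word, sentence, and syllable counts, because syllable counting is tool-dependent and the abstract does not say which software the authors used. The function names and the band labels (which follow Flesch's conventional interpretation table) are illustrative, not taken from the study.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch reading ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid grade level (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def reading_ease_band(score: float) -> str:
    """Map a reading ease score to the conventional Flesch difficulty band."""
    if score >= 90:
        return "5th grade"
    if score >= 80:
        return "6th grade"
    if score >= 70:
        return "7th grade"
    if score >= 60:
        return "8th-9th grade"
    if score >= 50:
        return "10th-12th grade"
    if score >= 30:
        return "college"
    return "college graduate"


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
sample_ease = flesch_reading_ease(words=100, sentences=5, syllables=150)
sample_grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)

# The average score reported in the Results falls in the "college" band.
reported_band = reading_ease_band(31.1)
```

Under this conventional banding, the reported average of 31.1 sits near the bottom of the college range (30 to 50), which is consistent with the college-level reading difficulty described in the Results.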

Conclusions:

There were notable gaps in comprehensiveness, accuracy, and readability of AI-generated exercise recommendations. Exercise and healthcare professionals should be aware of these limitations when using and/or endorsing AI-based technologies as a tool to support lifestyle change involving exercise.


