JMIR Preprints #74111: Purrfessor: A Fine-tuned Multimodal LLaVA Diet Health Chatbot

Purrfessor: A Fine-tuned Multimodal LLaVA Diet Health Chatbot

Linqi Lu;
Yifan Deng;
Chuan Tian;
Sijia Yang;
Dhavan Shah

ABSTRACT

Background:

The integration of Large Language-and-Vision Assistant (LLaVA) models with food and nutrition data enables multimodal meal analysis and contextual dietary guidance. However, little is known about how anthropomorphic chatbot design and AI-driven meal analysis influence user engagement and perception in health-related contexts.

Objective:

This study introduces Purrfessor, an innovative AI chatbot designed to provide personalized dietary guidance through interactive, multimodal engagement. The chatbot aims to deliver real-time, evidence-based support for food choices while examining the impact of anthropomorphism on user interaction and perception.

Methods:

The Purrfessor chatbot was trained using a combination of the FoodData Central database from the USDA, the Recipe2img dataset featuring food images and corresponding recipes, a curated human-annotated dataset derived from Recipe1M, and a customized Q&A dialogue dataset. Two studies were conducted to evaluate chatbot performance and user experience. First, a simulation assessment using GPT-4 and human validation examined the accuracy and descriptive capabilities of the fine-tuned LLaVA model. Second, in-depth interviews (N = 10) were conducted to explore user perceptions of Purrfessor, focusing on its effectiveness, engagement, and usability.

Results:

The simulation study demonstrated that the fine-tuned LLaVA chatbot achieved a mean cosine similarity score of 0.78 (SD = 0.12) in semantic alignment with GPT-4 annotations, suggesting strong consistency in dietary image interpretation. Error analysis of low-scoring cases (n = 100) revealed current limitations, including ambiguity (25%), omissions (20%), and hallucinations (12%). Human validation scores indicated high chatbot performance across correctness (M = 7.87), relevance (M = 9.4), clarity (M = 9.6), and handling of edge cases (M = 9.0), with strong inter-rater reliability (Krippendorff’s α = 0.85–0.96). In-depth interviews identified three primary factors driving user engagement: responsiveness, personalization, and interaction guidance. The anthropomorphic cat persona applied in the chatbot system can increase user interest and bonding, aligning with media equation theory and attachment theory in human-AI interaction.
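To make the semantic-alignment metric concrete, the following is a minimal sketch of how a mean cosine similarity and its standard deviation could be computed over paired text embeddings. It is not the authors' pipeline: the abstract does not state which embedding model was used, so the vectors below are toy values and the function names `cosine_similarity` and `summarize_alignment` are hypothetical.

```python
import math
from statistics import mean, stdev


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def summarize_alignment(model_vecs, reference_vecs):
    """Return (mean, sample SD, per-pair scores) for paired embeddings.

    model_vecs: embeddings of the chatbot's descriptions (assumed input).
    reference_vecs: embeddings of the reference annotations (assumed input).
    At least two pairs are needed for the sample standard deviation.
    """
    scores = [cosine_similarity(m, r) for m, r in zip(model_vecs, reference_vecs)]
    return mean(scores), stdev(scores), scores


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" purely for illustration.
    model = [[1.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
    reference = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    m, sd, per_pair = summarize_alignment(model, reference)
    print(f"mean = {m:.2f}, SD = {sd:.2f}")
```

Under this kind of scoring, the lowest-scoring pairs are the natural candidates for a manual error analysis such as the one reported for the low-scoring cases.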

Conclusions:

Findings highlight the role of anthropomorphic chatbot design and multimodal AI in improving user experience in diet health conversations. This study offers an example of AI-driven, evidence-based dietary guidance and underscores the potential of health chatbots to nudge informed health decision-making. These insights contribute to the development of digital health interventions and personalized health communication strategies, with implications for the design of engaging, user-centered AI health assistants.

Citation

Please cite as:

Lu L, Deng Y, Tian C, Yang S, Shah D

A Fine-Tuned Multimodal AI Chatbot for Dietary Health and Nutrition, Purrfessor: Development and Mixed Methods Evaluation

JMIR AI 2026;5:e74111

DOI: 10.2196/74111

PMID: 42061226

PMCID: 13132530

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Mar 18, 2025

Date Accepted: Feb 27, 2026
