JMIR Preprints #89077: Fine-Tuning Large Language Models for Motivational Interviewing in Health Behavior Change: A Development and Evaluation Study

Fine-Tuning Large Language Models for Motivational Interviewing in Health Behavior Change: A Development and Evaluation Study

Run-ze Hu;
Yang Yang;
Yi-hang Yang;
Jing-qi Kong;
Jia-hui Luo;
Wen-yu Yang;
Jing Chen;
Jing-yao Liu;
Hui-qun Zeng;
Zhang Lei;
Zheng Liu

ABSTRACT

Background:

Motivational interviewing (MI) is an effective counseling approach for promoting health behavior change, but its impact is constrained by the need for highly trained human counselors.

Objective:

This study aimed to explore a scalable alternative by developing and evaluating Large Language Models for Motivational Interviewing (MI-LLMs).

Methods:

We first curated five Chinese psychological counseling corpora. Using GPT-4 with an MI-informed prompt, we then converted multi-turn dialogues from the two highest-quality datasets (CPsyCounD and PsyDTCorpus) into 2,040 MI-style counseling conversations, of which 2,000 were used for training and 40 for testing. Three Chinese-capable open-source LLMs (Baichuan2-7B-Chat, ChatGLM-4-9B-Chat, and Llama-3-8B-Chinese-Chat-v2) were fine-tuned on this corpus and named MI-LLMs. We evaluated the MI-LLMs using round-based automatic metrics and expert manual coding with the Motivational Interviewing Treatment Integrity (MITI) Coding Manual 4.2.1.
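As a rough illustration of the data handling described in the Methods, the sketch below holds out 40 of 2,040 conversations and expands a dialogue into per-round evaluation samples. It rests on assumptions not stated in the abstract: that the split is a seeded random shuffle, that "round-based" means one sample per counselor turn, and that a dialogue is a list of (speaker, text) pairs. The names `split_corpus` and `to_rounds` are hypothetical.

```python
import random


def split_corpus(dialogues, n_test=40, seed=0):
    """Shuffle a copy of the corpus and hold out n_test dialogues.

    Assumption: a seeded random split; the abstract does not say how
    the 40 test conversations were chosen.
    """
    shuffled = list(dialogues)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def to_rounds(dialogue):
    """Expand one dialogue into (history, reference_reply) pairs.

    Assumption: each counselor turn that has preceding context forms
    one evaluation round, with all earlier turns as the history.
    """
    samples = []
    for i, (speaker, text) in enumerate(dialogue):
        if speaker == "counselor" and i > 0:
            samples.append((dialogue[:i], text))
    return samples


# Placeholder corpus of 2,040 two-turn dialogues (synthetic, for shape only).
corpus = [
    [("client", f"utterance {k}"), ("counselor", f"reply {k}")]
    for k in range(2040)
]
train, test = split_corpus(corpus)
```

Each pair returned by `to_rounds` could then be scored by comparing a model's generated reply with the reference reply, for example with BLEU-4 or ROUGE.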

Results:

Across all three models, fine-tuning substantially improved BLEU-4 and ROUGE scores compared with the base models. Manual coding showed that the MI-LLMs achieved technical and relational global scores and MI-adherent ratios approaching those of real MI dialogues, although complex reflections remained less frequent and reflection-to-question ratios remained lower.
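The manually coded measures named in the Results can be derived from per-session MITI codes. The sketch below follows the summary-score definitions in MITI 4 (technical global, relational global, percentage of complex reflections, reflection-to-question ratio). The "MI-adherent ratio" is computed here as adherent behaviors over adherent plus non-adherent behaviors, which is an assumption about how the authors define it. The `MitiCoding` class, the `summary_scores` function, and the example values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class MitiCoding:
    # Global ratings (1-5 scale in MITI 4)
    cultivating_change_talk: float
    softening_sustain_talk: float
    partnership: float
    empathy: float
    # Behavior counts
    questions: int
    simple_reflections: int
    complex_reflections: int
    mi_adherent: int      # seeking collaboration + affirm + emphasizing autonomy
    mi_non_adherent: int  # persuade + confront


def summary_scores(c):
    """Compute summary scores; a ratio with a zero denominator is None."""
    reflections = c.simple_reflections + c.complex_reflections
    coded = c.mi_adherent + c.mi_non_adherent
    return {
        "technical_global": (c.cultivating_change_talk + c.softening_sustain_talk) / 2,
        "relational_global": (c.partnership + c.empathy) / 2,
        "pct_complex_reflections": c.complex_reflections / reflections if reflections else None,
        "reflection_to_question": reflections / c.questions if c.questions else None,
        # Assumed definition, not confirmed by the abstract.
        "mi_adherent_ratio": c.mi_adherent / coded if coded else None,
    }


# Hypothetical coding of a single session.
example = MitiCoding(4, 3, 4, 4, questions=10, simple_reflections=6,
                     complex_reflections=4, mi_adherent=5, mi_non_adherent=0)
scores = summary_scores(example)
```

Returning `None` for undefined ratios, rather than zero, keeps sessions with no reflections or no questions from being mistaken for low-skill sessions when scores are averaged.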

Conclusions:

These findings provide initial evidence that MI-oriented fine-tuning can endow general-purpose LLMs with core MI-consistent counseling behaviors, suggesting a scalable pathway toward AI-assisted health behavior change support while underscoring the need for further work on data scale, complex MI skills and real-world intervention trials.

Citation

Please cite as:

Hu Rz, Yang Y, Yang Yh, Kong Jq, Luo Jh, Yang Wy, Chen J, Liu Jy, Zeng Hq, Lei Z, Liu Z

Fine-Tuning Large Language Models for Motivational Interviewing in Health Behavior Change: Development and Evaluation Study

JMIR Form Res 2026;10:e89077

DOI: 10.2196/89077

PMID: 42341298

PMCID: 13293567

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Formative Research

Date Submitted: Dec 6, 2025

Date Accepted: May 25, 2026
