Accepted for/Published in: JMIR Formative Research

Date Submitted: Dec 6, 2025
Date Accepted: May 25, 2026

The final, peer-reviewed published version of this preprint can be found here:

Hu Rz, Yang Y, Yang Yh, Kong Jq, Luo Jh, Yang Wy, Chen J, Liu Jy, Zeng Hq, Lei Z, Liu Z

Fine-Tuning Large Language Models for Motivational Interviewing in Health Behavior Change: Development and Evaluation Study

JMIR Form Res 2026;10:e89077

DOI: 10.2196/89077

PMID: 42341298

Fine-Tuning Large Language Models for Motivational Interviewing in Health Behavior Change: A Development and Evaluation Study

  • Run-ze Hu
  • Yang Yang
  • Yi-hang Yang
  • Jing-qi Kong
  • Jia-hui Luo
  • Wen-yu Yang
  • Jing Chen
  • Jing-yao Liu
  • Hui-qun Zeng
  • Zhang Lei
  • Zheng Liu

ABSTRACT

Background:

Motivational interviewing (MI) is an effective counseling approach for promoting health behavior change, but its impact is constrained by the need for highly trained human counselors.

Objective:

This study aimed to explore a scalable alternative by developing and evaluating Large Language Models for Motivational Interviewing (MI-LLMs).

Methods:

We first curated five Chinese psychological counseling corpora. Using GPT-4 with an MI-informed prompt, we then converted multi-turn dialogues from the two highest-quality datasets (CPsyCounD and PsyDTCorpus) into 2,040 MI-style counseling conversations, of which 2,000 were used for training and 40 for testing. Three Chinese-capable open-source LLMs (Baichuan2-7B-Chat, ChatGLM-4-9B-Chat, and Llama-3-8B-Chinese-Chat-v2) were fine-tuned on this corpus and are referred to as MI-LLMs. We evaluated the MI-LLMs using round-based automatic metrics and expert manual coding with the Motivational Interviewing Treatment Integrity (MITI) Coding Manual 4.2.1.
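The abstract does not say how the round-based automatic metrics were computed. The following is a minimal sketch of one possible reading, assuming that each counselor turn (round) produced by a model is scored against the reference turn at the same position and that the per-round scores are averaged. It uses a plain LCS-based ROUGE-L F1 over character-level tokens, which is an assumption made here for Chinese text; the function names are hypothetical and this is not the authors' evaluation code.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L as an unweighted F1 over characters (assumed tokenization)."""
    cand, ref = list(candidate), list(reference)
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def round_based_rouge_l(candidates: List[str], references: List[str]) -> float:
    """Average the per-round ROUGE-L F1 over aligned counselor turns."""
    if len(candidates) != len(references):
        raise ValueError("each generated round needs exactly one reference round")
    if not candidates:
        return 0.0
    scores = [rouge_l_f1(c, r) for c, r in zip(candidates, references)]
    return sum(scores) / len(scores)
```

In practice an established metric library with an explicit Chinese tokenizer would be preferable, since BLEU-4 and ROUGE values depend strongly on how the text is segmented.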

Results:

Across all three models, fine-tuning substantially improved BLEU-4 and ROUGE scores compared with the base models. Manual coding showed that the MI-LLMs achieved technical and relational global scores and MI-adherent ratios that approached those of real MI dialogues, although complex reflections remained less frequent and reflection-to-question ratios remained lower.
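The manual-coding outcomes named above are summary measures derived from MITI global ratings and behavior counts. As a rough sketch, the MITI 4.2.1 summary formulas can be computed as follows. The class and field names are hypothetical, and the MI-adherent ratio is not defined in the abstract: the form MIA / (MIA + MINA) used here is an assumption, not the authors' stated definition.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MitiCodes:
    # Global ratings, each on a 1-5 scale
    cultivating_change_talk: float
    softening_sustain_talk: float
    partnership: float
    empathy: float
    # Behavior counts for one coded segment
    questions: int
    simple_reflections: int
    complex_reflections: int
    seeking_collaboration: int
    affirm: int
    emphasizing_autonomy: int
    confront: int
    persuade: int


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def miti_summary(c: MitiCodes) -> Dict[str, Optional[float]]:
    reflections = c.simple_reflections + c.complex_reflections
    mi_adherent = c.seeking_collaboration + c.affirm + c.emphasizing_autonomy
    mi_non_adherent = c.confront + c.persuade
    return {
        "technical_global": (c.cultivating_change_talk + c.softening_sustain_talk) / 2,
        "relational_global": (c.partnership + c.empathy) / 2,
        "pct_complex_reflections": _ratio(c.complex_reflections, reflections),
        "reflection_to_question": _ratio(reflections, c.questions),
        # Assumed definition; the abstract does not specify this ratio.
        "mi_adherent_ratio": _ratio(mi_adherent, mi_adherent + mi_non_adherent),
    }
```

Returning None for an undefined ratio, for example a segment with no questions, keeps such segments from being silently counted as zero when scores are averaged across dialogues.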

Conclusions:

These findings provide initial evidence that MI-oriented fine-tuning can endow general-purpose LLMs with core MI-consistent counseling behaviors, suggesting a scalable pathway toward AI-assisted health behavior change support while underscoring the need for further work on data scale, complex MI skills and real-world intervention trials.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.