JMIR Preprints #99901: Bridging the Gap: Evaluating Large Language Models for Depression Support Through a Dual-Perspective of Doctors and Patients and the Emergence of a “Bridging Communication” Paradigm

Bridging the Gap: Evaluating Large Language Models for Depression Support Through a Dual-Perspective of Doctors and Patients and the Emergence of a “Bridging Communication” Paradigm

Yang Bai;
Qinye Zhou;
Longsheng Pan

ABSTRACT

Background:

Large language models (LLMs) have demonstrated potential as auxiliary tools in digital health scenarios, such as depression management. However, their effectiveness depends on their ability to meet both rigorous professional standards and individualised patient needs. Few studies have systematically evaluated the quality of LLM responses from both clinician and patient perspectives, which hinders the development of “patient-centred” medical artificial intelligence.

Objective:

This study aimed to develop a dual-paradigm evaluation framework that integrates professional-safety and experience-practicality perspectives, and to systematically compare how clinicians and patients evaluate LLM-generated responses to common questions about depression, in order to identify communication features that can bridge the cognitive divide.

Methods:

We selected the 10 most frequently asked questions from patients with depression and generated responses using four mainstream Chinese LLMs (DeepSeek-V3.2, GLM-4.6, Qwen-3-Max, and Kimi-k2-thinking). Ten psychiatrists and 130 clinically diagnosed patients with depression were invited to independently conduct blind scoring from their respective professional or experiential perspectives across six evaluation dimensions.

Results:

Significant differences were found between healthcare providers and patients across all evaluation dimensions (p < 0.05), with the greatest perceptual gap observed in “safety boundaries and risk awareness” (effect size r = 0.38). Key findings include: (1) The symbiosis of safety and empathy: From the patient’s perspective, perceived “safety” of a response was highly positively correlated with its “linguistic approachability” (ρ > 0.5), in stark contrast to the negative correlation observed in the physician group (ρ = -0.289). This suggests that safety warnings incorporating expressions of empathy are more likely to gain patient acceptance and trust. (2) Structural differences in evaluation logic: Patients tended to evaluate “clarity”, “practicality”, and “approachability” as an integrated whole (strong positive correlations), whereas doctors were able to assess “medical accuracy” as an independent core metric.
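The two statistics reported above (a rank-based effect size r and Spearman's ρ) can be illustrated with a minimal sketch. The abstract does not state which test produced r; the code below assumes the common convention r = |Z| / √N from a Mann-Whitney U normal approximation, without tie correction. All function names and ratings are hypothetical and are not the study's data or code.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def rank_effect_size_r(group_a, group_b):
    """Assumed convention: r = |Z| / sqrt(N), Z from Mann-Whitney U (no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    return abs(z) / math.sqrt(n1 + n2)

# Hypothetical 1-5 ratings of five responses on two dimensions.
safety = [1, 2, 3, 4, 5]
approachability = [2, 1, 4, 3, 5]
print(spearman_rho(safety, approachability))  # 0.8
```

With real rating data, a tie-corrected implementation from an established statistics library would be preferable, since Likert-type scores contain many ties.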

Conclusions:

Based on these findings, this study proposes that “bridging communication” should serve as the core developmental paradigm for future medical AI. This paradigm emphasises that an effective AI response requires a delicate balance between professional rigour and individual relevance, centring on two key transformations: translating standardised medical language into personal narratives that resonate with patients’ lived experiences, and transforming structured knowledge into actionable, personalised guidance. The best-performing models in this study (GLM-4.6 and Kimi-k2-thinking) demonstrated preliminary evidence of this “bridging” characteristic in their responses. This study not only evaluates existing models but, more importantly, provides a crucial theoretical framework and empirical basis for building the next generation of medical AI assistants that possess genuine communicative intelligence, empower patients, and support clinical practice.

Clinical Trial: NONE

Citation

Please cite as:

Bai Y, Zhou Q, Pan L

Bridging the Gap: Evaluating Large Language Models for Depression Support Through a Dual-Perspective of Doctors and Patients and the Emergence of a “Bridging Communication” Paradigm

JMIR Preprints. 18/05/2026:99901

DOI: 10.2196/preprints.99901

URL: https://preprints.jmir.org/preprint/99901

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: JMIR Formative Research

Date Submitted: May 18, 2026

Open Peer Review Period: May 19, 2026 - Jul 14, 2026

(currently open for review)
