Currently submitted to: JMIR AI

Date Submitted: Jun 22, 2026
Open Peer Review Period: Jul 1, 2026 - Aug 26, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Deceptive Empathy in Large Language Model Responses to Suicide-Related Disclosures

  • Yesim Keskin; 
  • Tricia Park; 
  • Selen Bozkurt

ABSTRACT

Background:

Large language model (LLM) chatbots are increasingly used for emotional and mental health support. Users often rate their responses as helpful, supportive, and empathic, and recent studies suggest that LLM-generated replies can sometimes be judged as more empathic or therapeutically preferable than human-written responses. In suicide-related disclosures, however, such perceived empathy may introduce a distinct safety concern. By generating human-like language that implies relational understanding, care, or presence, LLMs may simulate an interpersonal capacity they do not possess. Iftikhar et al term this phenomenon “deceptive empathy”, referring to anthropomorphic, relationally simulating responses that may mislead users about the nature of the interaction [7].

Objective:

This study aimed to quantify the prevalence of deceptive empathy in responses generated by commercially available large language models (LLMs) to suicide-related disclosures, a high-risk context requiring clinically appropriate and risk-sensitive responding.

Methods:

Fifty posts were randomly sampled from the CLPsych 2019 r/SuicideWatch dataset [18]. Two licensed psychotherapists independently reviewed these posts, and each selected 20 that represented high-risk suicide-related disclosures warranting immediate risk assessment and possible escalation to a higher level of care. After consensus review, 20 prompts were retained for model evaluation. In February 2026, each prompt was submitted once to 8 commercially available free-tier LLM systems: ChatGPT 5.2, Gemini 2.5 Flash, Claude Haiku 4.5, DeepSeek V3.2, Perplexity, Grok 3, Llama 4, and Mistral. A total of 160 outputs were evaluated. Deceptive empathy was coded by 3 independent coders using a prespecified rubric. Interrater reliability before consensus adjudication was acceptable (Fleiss κ=0.66).
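For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how Fleiss κ can be computed for 3 coders applying a binary code. The function name `fleiss_kappa` and the toy counts are hypothetical, chosen only for illustration; they are not the study's data or the authors' analysis code.

```python
def fleiss_kappa(counts):
    """Fleiss kappa for a table where counts[i][j] is the number of
    coders who assigned item i to category j.

    Every item must be rated by the same number of coders.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each item must have the same number of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement per item: fraction of coder pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_item) / n_items          # mean observed agreement
    p_exp = sum(p * p for p in p_cat)      # agreement expected by chance
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical toy data: 4 responses, 3 coders,
# columns = [coded "present", coded "absent"].
toy_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(toy_counts), 2))
```

In this formulation κ compares the mean pairwise agreement among coders with the agreement expected if codes were assigned at random in proportion to their overall frequency.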

Results:

One response (0.6%) was excluded because it was generated in Spanish, leaving 159 evaluable outputs. Deceptive empathy was present in 96/159 responses (60.4%). Prevalence varied substantially across models, ranging from 0/20 (0.0%) to 20/20 (100.0%). Differences by model were statistically significant (χ²(7)=84.35; P<.001). Three models generated deceptive empathy in all responses, whereas 1 model produced none because it refused all prompts.
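The arithmetic behind these figures can be sketched as follows, assuming the reported statistic is a standard Pearson χ² test of independence on a model-by-outcome table (the abstract does not name the variant). The helper names are hypothetical, and because per-model counts are not reported in the abstract, the small table below is a toy example rather than the study's data.

```python
def prevalence_pct(positive, total):
    """Percentage of responses coded positive, rounded to 1 decimal place."""
    return round(100.0 * positive / total, 1)


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    observed counts (rows = groups, columns = outcomes).

    Assumes every row total and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if outcome were independent of group.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Figures stated in the abstract.
print(prevalence_pct(96, 159))   # overall prevalence
print(prevalence_pct(1, 160))    # share of outputs excluded

# Hypothetical toy table: 3 groups, columns = [present, absent].
toy_table = [[18, 2], [9, 11], [3, 17]]
stat, df = chi_square_independence(toy_table)
print(round(stat, 2), df)
```

With 8 models and 2 outcomes, the degrees of freedom are (8-1)×(2-1)=7, which matches the value reported above.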

Conclusions:

Deceptive empathy was common in LLM responses to suicide-related disclosures and varied sharply by model. In high-risk mental health contexts, perceived empathy should not be treated as evidence of safety or clinical adequacy. Evaluations of mental health chatbots should separately assess deceptive empathy, risk acknowledgment, escalation guidance, and refusal behavior.


Citation

Please cite as:

Keskin Y, Park T, Bozkurt S

Deceptive Empathy in Large Language Model Responses to Suicide-Related Disclosures

JMIR Preprints. 22/06/2026:105273

DOI: 10.2196/preprints.105273

URL: https://preprints.jmir.org/preprint/105273

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.