
Currently accepted at: JMIR Mental Health

Date Submitted: Nov 12, 2025
Date Accepted: Mar 9, 2026

This paper has been accepted and is currently in production.

It will appear shortly at DOI 10.2196/87586.

This is the final accepted version, not yet copyedited.

Identifying the Presence and Timing of Self-harm in Electronic Mental Health Records Using Privacy-Preserving Local Language Models: Methodological Study

  • Andrey Kormilitzin; 
  • Dan W Joyce; 
  • Apostolos Tsiachristas; 
  • Rohan Borschmann; 
  • Navneet Kapur; 
  • Galit Geulayov

ABSTRACT

Background:

Self-harm is the strongest risk factor for suicide and an important outcome for mental health care. Although prevalent in clinical populations, it is often imprecisely captured in routinely collected clinical data, where it is typically recorded and stored as unstructured free text. Contemporary language models, such as GPT (OpenAI) and Gemini (Google), can analyse free-text clinical notes, but such cloud-based, closed-source commercial models may violate data governance requirements for processing sensitive patient data.

Objective:

To evaluate whether a privacy-preserving language model running entirely within an institution’s secure computing infrastructure (here, the UK National Health Service; NHS) could accurately identify the presence and timing of self-harm using electronic health records (EHRs) from secondary mental healthcare.

Methods:

Clinical notes were drawn from Oxford Health NHS Foundation Trust using a multi-stage workflow: (1) a random sample of 1,000 patients with a psychiatric diagnosis (ICD-10 F00–F99) was selected; (2) candidate notes were identified using a Gemma3-4b language model to flag notes containing self-harm content; (3) from those candidates, 1,352 randomly sampled notes were selected for expert annotation. The resulting gold-standard corpus is therefore enriched for self-harm content. Each clinical note was annotated for the presence or absence of self-harm and its timing (≤90 days, >90 days, or unknown). A privacy-preserving, locally served 27-billion-parameter Gemma 3 language model ('Gemma3-27b') was used as the core model. Prompts were systematically developed and refined on a labelled development set to identify self-harm and generate a structured output per clinical record. The performance of the Gemma3-27b model was compared against a strong baseline multi-label text classification model based on the RoBERTa (Robustly Optimized BERT Pretraining Approach, a transformer-based language model) architecture. Model performance was evaluated using precision, recall, and the F1-score (the harmonic mean of precision and recall), with 95% confidence intervals estimated from 1,000 bootstrap samples drawn with replacement.
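The evaluation protocol described above (per-class precision, recall, and F1, with 95% confidence intervals from bootstrap resampling with replacement) can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the function names and any labels passed to them are hypothetical.

```python
import random

def prf1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, one-vs-rest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def bootstrap_f1_ci(y_true, y_pred, positive, n_boot=1000, seed=0):
    """95% percentile CI for F1: resample note-level (true, pred) pairs
    with replacement n_boot times and take the 2.5th/97.5th percentiles."""
    rng = random.Random(seed)
    pairs = list(zip(y_true, y_pred))
    scores = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        st, sp = zip(*sample)
        scores.append(prf1(st, sp, positive)[2])
    scores.sort()
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot)]
```

In practice a library such as scikit-learn provides equivalent per-class metrics; the sketch makes explicit that the bootstrap resamples whole annotated notes, so each resample preserves the pairing of gold label and model output.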

Results:

Gemma3-27b outperformed the RoBERTa classifier across all categories, achieving Precision=0.92, Recall=0.92 (sensitivity), and F1-score=0.92 for notes containing self-harm, and Precision=0.97, Recall=0.97 (specificity), and F1-score=0.97 for notes without self-harm. For the 51 notes labelled as recent self-harm in the held-out test set, Gemma3-27b achieved Precision=0.84, Recall=0.75, and F1-score=0.79. The global weighted F1-score of Gemma3-27b across all categories was 0.88, compared with 0.85 for RoBERTa.
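As a quick arithmetic check, each reported F1-score is the harmonic mean of its paired precision and recall; the recent self-harm figures above reproduce as expected:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Recent self-harm category: Precision=0.84, Recall=0.75
print(round(f1_score(0.84, 0.75), 2))  # → 0.79, matching the reported value
```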

Conclusions:

With systematic prompt development on a labelled development set, but no gradient-based fine-tuning, the current Gemma3-27b language model matched or exceeded a fine-tuned RoBERTa classifier for ascertaining self-harm events and their timing. Aggregate gains were modest, while improvements were largest in the most challenging, lower-frequency timing categories. On a simplified binary recent-versus-other task, RoBERTa performed marginally better, indicating that supervised classifiers remain highly effective when the task is simplified and sufficient labelled data exist. This work demonstrates the technical feasibility of privacy-preserving self-harm detection within a secure NHS research environment.

Clinical Trial: None


 Citation

Please cite as:

Kormilitzin A, Joyce DW, Tsiachristas A, Borschmann R, Kapur N, Geulayov G

Identifying the Presence and Timing of Self-harm in Electronic Mental Health Records Using Privacy-Preserving Local Language Models: Methodological Study

JMIR Mental Health. 09/03/2026:87586 (forthcoming/in press)

DOI: 10.2196/87586

URL: https://preprints.jmir.org/preprint/87586




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.