JMIR Preprints #94454: Utilizing Large-Language Model for the Automatic Extraction of Clinical Course Information in Psychiatric Disorders

Utilizing Large-Language Model for the Automatic Extraction of Clinical Course Information in Psychiatric Disorders

Chien-Hung Chen;
Hong-Jie Dai;
Chu-Hsien Su;
Shi-Heng Wang;
Yi-Ling Chien;
Wei-Lieh Huang;
Chi-Shin Wu;
Hsin-Hsi Chen

ABSTRACT

Background:

Understanding the clinical course of psychiatric disorders is vital for informed decision-making. Extracting details such as onset time, episode count, and hospitalization history from unstructured psychiatric notes remains challenging.

Objective:

This study evaluates how well four large language models (LLMs: LLaMA, MentaLLaMA, OpenBioLLM, and Mistral) extract temporal and event-based clinical information from psychiatric discharge summaries.

Methods:

We used 500 annotated discharge summaries from the NTUH-iMD, covering psychiatric diagnoses (ICD-9-CM: 290–319, ICD-10-CM: F00–F99). Key temporal and clinical course features were annotated by a psychiatrist and an NLP researcher. A two-stage extraction process was implemented: first, sentence-level models identified clinical events and temporal cues; then, a chart-level model predicted four clinical course features: onset time, episode count, number of hospitalizations, and most recent hospitalization time. Performance was evaluated using F1-scores.
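As a rough illustration of the evaluation metric named above, the following minimal sketch computes a per-label F1-score from gold and predicted sentence labels. The abstract does not state how the F1-scores were averaged or how labels were encoded, so the function name `f1_per_label`, the label strings, and the toy data are all hypothetical and are not the authors' implementation.

```python
from collections import Counter


def f1_per_label(gold, pred):
    """Return {label: F1} given two parallel lists of label sets (one set per sentence)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold_labels, pred_labels in zip(gold, pred):
        for label in pred_labels & gold_labels:
            tp[label] += 1  # predicted and annotated
        for label in pred_labels - gold_labels:
            fp[label] += 1  # predicted but not annotated
        for label in gold_labels - pred_labels:
            fn[label] += 1  # annotated but missed

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        predicted = tp[label] + fp[label]
        annotated = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / annotated if annotated else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Toy data (hypothetical): four sentences with illustrative event labels.
gold = [{"episode"}, {"hospitalization"}, {"episode", "hospitalization"}, set()]
pred = [{"episode"}, set(), {"episode"}, {"episode"}]
scores = f1_per_label(gold, pred)
```

In this toy case the "episode" label has two true positives and one false positive, while "hospitalization" is never predicted, so the two labels receive very different scores even though they come from the same sentences.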

Results:

Among 12,947 analyzed sentences, 7,177 included clinical events and 4,842 contained temporal information. At the sentence level, Mistral achieved the highest F1-scores for both event extraction (episodes: 0.968; hospitalizations: 0.933; remission/response: 0.901) and temporal information extraction (age: 0.968; time expressions: 0.967; duration: 0.901). At the chart level, Mistral also achieved the highest F1-scores for onset time (0.714), episode count (0.624), number of hospitalizations (0.676), and most recent hospitalization time (0.842).

Conclusions:

Fine-tuned LLMs, especially Mistral, can accurately extract structured clinical course information from psychiatric notes, offering a scalable alternative to manual review. Future work should address vague temporal expressions, expand feature sets, and validate generalizability across diverse settings.

Clinical Trial: None

Citation

Please cite as:

Chen CH, Dai HJ, Su CH, Wang SH, Chien YL, Huang WL, Wu CS, Chen HH

Utilizing Large-Language Model for the Automatic Extraction of Clinical Course Information in Psychiatric Disorders

JMIR Preprints. 02/03/2026:94454

DOI: 10.2196/preprints.94454

URL: https://preprints.jmir.org/preprint/94454

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: JMIR Formative Research

Date Submitted: Mar 2, 2026

Open Peer Review Period: Mar 2, 2026 - Mar 2, 2026

(closed for review but you can still tweet)

NOTE: This is an unreviewed Preprint
