Currently submitted to: Journal of Medical Internet Research
Date Submitted: Apr 8, 2026
Open Peer Review Period: Apr 9, 2026 - Jun 4, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Multilingual Evidence-Based Question Answering for Stroke Discharge Summaries: Study of Cross-Lingual Heterogeneity in Clinical Reports
ABSTRACT
Background:
The Registry of Stroke Care Quality (RES-Q) is a healthcare quality improvement platform used globally. RES-Q collects structured quality-of-care data for stroke patients, requiring clinicians to manually extract information from electronic health records or documents such as discharge summaries. This process is essential but time-consuming, particularly given the variability, length, and semi-structured nature of clinical reports.
Objective:
To develop and evaluate a multilingual Evidence-Based Question-Answering framework that identifies supporting text spans in clinical reports of stroke patients and proposes answer suggestions for structured clinical forms, with the goal of reducing clinician workload while preserving full human oversight.
Methods:
We conduct a multilingual study using 1,596 pseudonymized stroke discharge summaries in six languages, annotated with question-evidence-answer triplets. Encoder-based language models are used to extract evidence spans from the reports, while generative language models are used to predict normalized form answers based on the extracted evidence. We compare multiple training strategies: models trained on reports in a single target language, models trained jointly on reports in different languages, and models trained on original reports combined with cross-lingual data augmentations. We evaluate performance on Evidence Extraction, Answer Prediction, and end-to-end Evidence-Based Question Answering across the six languages.
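The two-stage design described above (evidence-span extraction followed by answer normalization) can be sketched as follows. This is a minimal illustrative outline only, not the authors' implementation: the function names, the toy stand-in "models," and the example question are all assumptions introduced for clarity.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FormAnswer:
    question: str
    evidence: str   # supporting span from the report, shown to the clinician
    answer: str     # normalized value suggested for the structured form field

def answer_form_question(
    report: str,
    question: str,
    extract_evidence: Callable[[str, str], str],  # stage 1: e.g. an encoder QA model
    predict_answer: Callable[[str, str], str],    # stage 2: e.g. a generative model
) -> FormAnswer:
    """Two-stage Evidence-Based QA: extract a span, then normalize it to a form answer."""
    span = extract_evidence(report, question)
    return FormAnswer(question, span, predict_answer(question, span))

# Toy stand-ins for the two model stages (illustrative only):
def toy_extractor(report: str, question: str) -> str:
    # Mock span extraction: return the sentence mentioning "thrombolysis".
    return next((s.strip() for s in report.split(".") if "thrombolysis" in s), "")

def toy_predictor(question: str, evidence: str) -> str:
    # Mock normalization: map any supporting evidence to a categorical form value.
    return "yes" if evidence else "unknown"

report = "Patient admitted with ischemic stroke. IV thrombolysis was administered."
result = answer_form_question(report, "Was thrombolysis given?", toy_extractor, toy_predictor)
# result.evidence → "IV thrombolysis was administered"; result.answer → "yes"
```

Keeping the extracted span alongside the predicted answer is what enables the human-in-the-loop validation the framework targets: the clinician can verify the suggestion against its supporting evidence before accepting it into the form.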
Results:
The presented Evidence-Based Question-Answering system achieves 89% end-to-end accuracy in form filling across six languages (77% for patient-specific questions and 95% for default or unverifiable items). Evidence Extraction is the primary bottleneck, reaching 85% F1 and 79% Exact Match, whereas Answer Prediction based on extracted evidence is more stable, achieving 95% accuracy. Performance varies by question type, and cross-lingual training generally reduces Evidence Extraction performance but has little effect on Answer Prediction. Model performance is influenced more by reporting practices and dataset characteristics than by language itself.
Conclusions:
Evidence-Based Question Answering over multilingual stroke discharge summaries enables human-in-the-loop validation and effective answer prediction with moderate computational resources. Evidence Extraction is the main bottleneck, while Answer Prediction is robust across languages and model sizes. The approach supports structured data collection, though generalization to new languages requires target-language training data.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.