Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 18, 2025
Date Accepted: Dec 31, 2025

The final, peer-reviewed published version of this preprint can be found here:

Temporal Annotation of German Clinical Language in Real and Synthetic Clinical Documents: Corpus Development and Baseline Tagger Validation Study

Modersohn L, Hahn U

J Med Internet Res 2026;28:e71458

DOI: 10.2196/71458

PMID: 41740143

PMCID: 12980054

Temporal Annotation of German Clinical Language in Real and Synthetic Clinical Documents: A Corpus Development and Baseline Tagger Validation Study

  • Luise Modersohn
  • Udo Hahn

ABSTRACT

Background:

Temporal information about patients is a valuable source for clinical decision-making and medical treatment. Automatically extracting such data from unstructured clinical narratives, with the overarching goal of constructing consistent, coherent, and maximally complete clinical timelines from patients’ clinical histories, requires time-annotated clinical reports and notes from which time-informed taggers can be learned. Non-English clinical language communities, with German as a typical example, suffer from a systematic lack of such training resources.

Objective:

To overcome the lack of comprehensive and shareable temporally annotated German clinical corpora, we developed an annotation schema for both temporal entities and temporal relations, adapted to the needs of German medical language. Using the resulting annotations, we then trained state-of-the-art baseline classifiers for this task.

Methods:

Starting from two existing temporal annotation guidelines for English clinical documents, we developed a first version of annotation guidelines for both temporal named entities and temporal relations in German. We then refined these guidelines iteratively and adapted them to German clinical documents, incorporating the working experience of five clinically trained annotators (medical students). This refinement was carried out on small, randomly selected subsets of two German clinical corpora, a real-world one (3000PAJ) and a synthetic one (GraSCCo). Afterwards, both corpora were fully annotated; agreement was measured on a randomly selected 10% of the documents for 3000PAJ and on the entire corpus for GraSCCo. To measure inter-annotator agreement (IAA), we used pairwise F1-scores. We then used these annotations to train and fine-tune BERT-based models as baselines for automatically recognizing temporal named entities and temporal relations in German-language clinical documents.
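The pairwise F1 measure of IAA mentioned above can be illustrated with a minimal sketch. The abstract does not state the matching criterion, so the following assumes exact matching of span boundaries and labels, averaged over all annotator pairs; the function names, span offsets, and toy annotations are hypothetical and not taken from the study.

```python
from itertools import combinations

def f1_between(a, b):
    """Exact-match F1 between two annotators' sets of (start, end, label) spans.

    With one annotator treated as reference and the other as prediction,
    F1 reduces to 2 * |A ∩ B| / (|A| + |B|), which is symmetric.
    """
    if not a and not b:
        return 1.0  # assumption: two empty annotation sets count as full agreement
    overlap = len(a & b)
    return 2 * overlap / (len(a) + len(b))

def pairwise_f1(annotations):
    """Mean F1 over all unordered pairs of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(f1_between(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy data: character offsets plus a TimeML-style type label.
annotations = {
    "ann1": {(0, 10, "DATE"), (15, 22, "DURATION"), (30, 35, "TIME")},
    "ann2": {(0, 10, "DATE"), (15, 22, "DURATION")},
    "ann3": {(0, 10, "DATE"), (15, 22, "SET"), (30, 35, "TIME")},
}

print(round(pairwise_f1(annotations), 3))
```

Because the formula is symmetric, it does not matter which annotator in a pair is treated as the reference. A relaxed variant (for example, counting overlapping spans as matches) would yield higher scores and would need a different matching step.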

Results:

To break the annotated-data bottleneck for clinical German, we created 3000PAJ-temp, a time-annotated corpus of real clinical documents (which cannot be distributed because of the strict privacy policy enforced for German clinical data), and GraSCCo-temp, a synthetic one (which is publicly available without any restrictions). With our final guidelines, we achieved an IAA F1-score of 0.9 on both corpora for the temporal named entity task. For the temporal relation task, the IAA F1-score was 0.57 on GraSCCo and 0.41 on 3000PAJ. These results are in line with previously reported agreement values on English datasets. Our baseline classifier for named entities achieved F1-scores between 0.64 and 0.85, depending on the combination of training and test datasets. For automatic relation extraction, we achieved F1-scores between 0.60 and 0.64.

Conclusions:

We discuss our TimeML-compliant annotation scheme for German clinical language, details of the annotation campaign and the measurement of inter-annotator agreement, and, finally, provide baseline classifiers for temporal tagging of German-language clinical documents.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.