
Currently submitted to: Journal of Medical Internet Research

Date Submitted: Jan 29, 2026
Open Peer Review Period: Feb 2, 2026 - Mar 30, 2026

NOTE: This is an unreviewed Preprint

Warning: This is an unreviewed preprint. Readers are cautioned that this document has not been peer-reviewed by expert/patient reviewers or an academic editor, may contain misleading claims, is likely to undergo changes before final publication if accepted, and may have been rejected or withdrawn (in which case a "no longer under consideration" note will appear above).

Citation: Please cite this preprint only for review purposes or for grant applications and CVs (if you are the author).

Final version: If our system detects a final peer-reviewed "version of record" (VoR) published in any journal, a link to that VoR will appear below. Readers are then encouraged to cite the VoR instead of this preprint.

Settings: If you are the author, you can log in and change the preprint display settings. However, the preprint URL/DOI is intended to be stable and citable, so the preprint should not be removed once posted.

Submit: To post your own preprint, simply submit to any JMIR journal and choose the appropriate settings to expose your submitted version as a preprint.

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Enhancing Healthcare Interoperability Using Large Language Models: A Generative Proof-of-Concept Framework to Extract Medical Information from Unstructured Clinical Text

  • Bahadır Eryılmaz; 
  • Kamyar Arzideh; 
  • Mikel Bahn; 
  • Hendrik Damm; 
  • Sina Warmer; 
  • Henning Schäfer; 
  • Ahmad Idrissi-Yaghir; 
  • Tabea Pakull; 
  • Lea Jessica Albrecht; 
  • Jens Kleesiek; 
  • Georg Lodde; 
  • Christoph M. Friedrich; 
  • Elisabeth Livingstone; 
  • Dirk Schadendorf; 
  • Katarzyna Borys; 
  • Felix Nensa; 
  • René Hosch

ABSTRACT

Background:

Unstructured clinical text remains a major barrier to interoperable data reuse and large-scale secondary analysis in healthcare. Large language models (LLMs) have the potential to automate the extraction of structured clinical information; however, their application is limited by the scarcity of high-quality annotated training data.

Objective:

-

Methods:

We evaluated an LLM-based pipeline for extracting structured clinical information from cancer-related discharge letters and mapping it to representations compatible with Fast Healthcare Interoperability Resources (FHIR). To enable large-scale supervised training, we developed a random sample generator that creates synthetic discharge letters with Qwen3 235B by randomly sampling and aggregating structured FHIR data from 41,175 cancer patients. The resulting synthetic discharge letters (n=75,000) were paired with their originating structured data, forming a large-scale dataset for fine-tuning MedGemma 27B. Evaluation was conducted on a synthetic test dataset (n=7,500), on real-world discharge letters (n=30) assessed by physicians and a medical student, and against a comparative one-shot approach using open-source models (Qwen3, LLaMA, and GPT-OSS).
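The generation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record structure, prompt wording, and `build_generation_prompt` helper are hypothetical, and the actual call to Qwen3 235B is omitted — the returned prompt would be sent to the model, and its letter paired with the sampled records as a training example.

```python
import json
import random

def build_generation_prompt(patient_records, n_samples=3, seed=42):
    """Randomly sample structured FHIR-style entries for one synthetic
    letter and build a generation prompt. The LLM's answer would then be
    paired with the sampled records as the supervision target."""
    rng = random.Random(seed)
    sample = rng.sample(patient_records, k=min(n_samples, len(patient_records)))
    structured = json.dumps(sample, indent=2, ensure_ascii=False)
    prompt = (
        "Write a realistic oncology discharge letter that mentions every "
        "diagnosis, medication, laboratory value, and procedure listed "
        "below, and nothing else:\n" + structured
    )
    return prompt, sample

# Toy FHIR-like records; real input would be full FHIR resources.
records = [
    {"resourceType": "Condition", "code": "ICD-10 C43.5"},
    {"resourceType": "MedicationStatement", "medication": "Pembrolizumab 200 mg"},
    {"resourceType": "Observation", "code": "LDH", "value": 412, "unit": "U/L"},
]
prompt, target = build_generation_prompt(records, n_samples=2)
```

Because the letter is generated from exactly the sampled records, every training pair comes with a complete structured "gold" annotation at no manual labeling cost — the key idea behind the 75,000-letter dataset.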

Results:

The fine-tuned model achieved high extraction performance across multiple clinical entities, including full ICD diagnosis codes (F1 = 0.84), tumor-related information (0.99), laboratory values (0.99), medication names and dosages (0.99), and ATC medication codes (0.94). Extraction of procedure-related information was more challenging but remained reliable, with F1 scores of 0.63 for OPS codes and 0.90 for procedure descriptions. In a one-shot comparison, the fine-tuned model outperformed the general-purpose LLMs in nearly all extraction categories. When applied to real-world discharge letters, performance remained robust, with F1 scores of 0.789 for ICD diagnoses, 0.861 for tumor-related information, 0.93 for medications, and 0.613 for procedures.
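An entity-level F1 of the kind reported above can be computed as in this sketch. The exact matching criteria used in the paper (code normalization, handling of partial matches) are not stated here, so `entity_f1` is a hypothetical helper that assumes strict exact-match scoring over multisets of extracted entities.

```python
from collections import Counter

def entity_f1(predicted, gold):
    """Micro precision/recall/F1 over multisets of extracted entities
    (e.g., ICD codes), counting an entity as correct only on exact match."""
    pred, ref = Counter(predicted), Counter(gold)
    tp = sum((pred & ref).values())   # entities present in both
    fp = sum(pred.values()) - tp      # extracted but not in gold
    fn = sum(ref.values()) - tp       # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Two of three extracted ICD codes match the gold annotation:
# precision = recall = 2/3, so F1 = 2/3.
p, r, f1 = entity_f1(["C43.5", "I10", "E11.9"], ["C43.5", "E11.9", "C79.9"])
```

Multiset counting matters for entities that can legitimately recur in a letter (e.g., repeated laboratory values), where plain set intersection would undercount.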

Conclusions:

These results demonstrate that synthetic text generation from structured clinical data enables effective and scalable training of LLMs for extracting interoperable, multi-entity clinical information from unstructured documentation.


Citation

Please cite as:

Eryılmaz B, Arzideh K, Bahn M, Damm H, Warmer S, Schäfer H, Idrissi-Yaghir A, Pakull T, Albrecht LJ, Kleesiek J, Lodde G, Friedrich CM, Livingstone E, Schadendorf D, Borys K, Nensa F, Hosch R

Enhancing Healthcare Interoperability Using Large Language Models: A Generative Proof-of-Concept Framework to Extract Medical Information from Unstructured Clinical Text

JMIR Preprints. 29/01/2026:92413

DOI: 10.2196/preprints.92413

URL: https://preprints.jmir.org/preprint/92413


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.