Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 20, 2025
Open Peer Review Period: Oct 20, 2025 - Dec 15, 2025
Date Accepted: May 18, 2026

The final, peer-reviewed published version of this preprint can be found here:

Extracting and Classifying Drug Discontinuations From Estonian Electronic Health Records: Development and Validation Study

Šuvalov H, Umov N, Haug M, Laur S, Oja M, Tamm S, Reisberg S, Vilo J, Kolde R

J Med Internet Res 2026;28:e86183

DOI: 10.2196/86183

PMID: 42308511

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Extracting and Classifying Drug Discontinuations from Estonian Electronic Health Records: Development and Validation Study

  • Hendrik Šuvalov
  • Nikita Umov
  • Markus Haug
  • Sven Laur
  • Marek Oja
  • Sirli Tamm
  • Sulev Reisberg
  • Jaak Vilo
  • Raivo Kolde

ABSTRACT

Background:

Drug adherence is crucial for chronic disease management, yet treatment discontinuation remains common due to factors such as side effects, inefficacy, or cost. These reasons are often recorded only in free-text clinical notes, making large-scale analysis difficult. While large language models (LLMs) can interpret such unstructured data more effectively than traditional natural language processing methods, few studies have systematically categorized reasons for discontinuation or identified whether the decision was initiated by the patient or the clinician, especially in low-resource languages like Estonian.

Objective:

To assess the ability of LLMs to extract and classify reasons for drug discontinuation and identify who initiated it using Estonian electronic health records, and to characterize the observed discontinuation patterns and initiators for statins and antidiabetic medications.

Methods:

We combined prescription data with free-text anamneses from a 10% sample of the Estonian population (2012–2019). LLMs (Llama-3.1-70B and GPT-4o) were applied to extract discontinuation phrases and reasons, classify them into a clinician-developed taxonomy, and identify who discontinued the treatment. Performance was evaluated on 100 randomly chosen cases per drug group.

Results:

Extraction yielded 625 antidiabetic drug and 233 statin discontinuation cases. Validation confirmed high accuracy, with 93–98% of extracted phrases and 95–96% of extracted reasons judged correct. Classification of discontinuation reasons achieved weighted F1 scores of 0.808–0.836, while classification of who initiated discontinuation achieved weighted F1 scores of 0.645–0.774. Adverse reactions were the most frequent reason overall, accounting for ~70% of discontinuations for statins and ~44% for antidiabetic drugs. For antidiabetic drugs, treatment inefficacy and contraindications were more common. Patients more often stopped due to adverse reactions or non-medical reasons, while physicians more often initiated discontinuation for contraindications.
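For readers unfamiliar with the metric, the weighted F1 score reported above is the mean of per-class F1 scores, with each class weighted by its number of true cases. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name, the category labels, and the five toy cases are hypothetical and chosen only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)    # number of true cases per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n in support.items():
        tp = true_pos[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += n * f1          # weight each class F1 by its support
    return total / len(y_true)


# Toy data only: made-up reference labels and model outputs.
y_true = ["adverse_reaction", "adverse_reaction", "adverse_reaction",
          "inefficacy", "contraindication"]
y_pred = ["adverse_reaction", "adverse_reaction", "inefficacy",
          "inefficacy", "contraindication"]

score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.3f}")
```

Because the weights follow class frequency, a frequent category such as adverse reactions influences the score more than a rare one, which is worth keeping in mind when comparing the two classification tasks.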

Conclusions:

LLMs can accurately extract and classify medication discontinuation reasons and initiators from Estonian clinical narratives. Both local and proprietary models performed well, enabling scalable analyses that complement structured health records. This demonstrates the potential of LLMs to unlock information from clinical notes, turning this underutilized EHR component into a valuable resource for monitoring treatment patterns and detecting adverse event signals.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.