Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: May 20, 2025
Date Accepted: Nov 27, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Detection of antithrombotic-related bleeding in older inpatients using structured and unstructured electronic medical record data: A multicentre cross-sectional study in Switzerland
ABSTRACT
Background:
Bleeding complications are a major contributor to adverse drug events (ADEs) among older inpatients, particularly in those treated with antithrombotic agents. Timely and accurate detection of bleeding events is essential for improving drug safety surveillance and clinical risk management.
Objective:
This study aimed to develop and validate an automated detection system for major bleeding (MB) and clinically relevant non-major bleeding (CRNMB) using both structured and unstructured electronic medical record (EMR) data.
Methods:
We conducted a retrospective multicentre study using EMR data from three Swiss university hospitals. Patients aged ≥65 years who received one or more antithrombotic agents and were hospitalised between January 2015 and December 2016 were included. Bleeding events were defined based on the International Society on Thrombosis and Haemostasis (ISTH) criteria. Rule-based algorithms were developed using structured data (ICD-10-GM codes, laboratory values, transfusion records, antihaemorrhagic prescriptions). Natural Language Processing (NLP) was applied to discharge summaries in one hospital. A manual review of 754 EMRs served as the reference standard. Algorithm performance was assessed using sensitivity, specificity, predictive values, and F1-score, with priority given to sensitivity.
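The performance measures named above can be illustrated with a minimal sketch that derives them from a 2x2 comparison of algorithm output against the manual reference standard. The class name, function name, and counts below are hypothetical and chosen only for illustration; they are not the study's data or code, and the sketch assumes every denominator is non-zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical 2x2 counts: algorithm result vs. manual chart review."""
    tp: int  # bleeding flagged by the algorithm and confirmed by review
    fp: int  # bleeding flagged by the algorithm but not confirmed
    fn: int  # bleeding found by review but missed by the algorithm
    tn: int  # no bleeding according to both


def performance(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, NPV and F1-score.

    Assumes all denominators are non-zero (an assumption of this sketch).
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only; these are NOT taken from the study.
example = ConfusionCounts(tp=80, fp=40, fn=20, tn=360)
metrics = performance(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because F1 weights sensitivity and PPV equally, a design that prioritises sensitivity, as described above, can accept a lower PPV and therefore a moderate F1-score.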
Results:
Among 36 039 inpatient stays, the structured data algorithms (SDA) identified 8.3% MB and 15.0% CRNMB cases. ICD-10-GM codes alone detected 28.5% of MB and 31.5% of CRNMB cases; laboratory data were the largest contributor (67%). Integrating SDA with NLP improved detection, identifying 12.2% MB and 27.4% CRNMB cases in the one hospital where NLP was applied. The best-performing model, SDA combined with NLP, achieved a sensitivity of 0.84, a positive predictive value (PPV) of 0.51, and an F1-score of 0.64 for overall bleeding detection.
Conclusions:
Our integrated approach, which combines structured data algorithms with NLP, enhances the detection of haemorrhagic events in older hospitalised patients treated with antithrombotic agents and provides a valuable tool for drug safety monitoring and clinical risk management.