Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Oct 29, 2024
Date Accepted: Mar 28, 2025
TOO-BERT: A Trajectory Order Objective BERT for self-supervised representation learning of temporal healthcare data
ABSTRACT
Background:
The growing availability of Electronic Health Records (EHRs) presents an opportunity to enhance patient care by uncovering hidden health risks and supporting informed decisions through advanced deep learning methods. However, modeling sequential EHR data, often referred to as patient trajectories, is complex because the relationships between diagnoses and treatments evolve over time: medical conditions and interventions alter the likelihood of future health outcomes. While BERT-inspired models have shown promise in modeling EHR sequences by pretraining on the masked language modeling (MLM) objective, they struggle to fully capture the intricate temporal dynamics of disease progression and medical interventions.
Objective:
In this study, we introduce TOO-BERT, a novel adaptation that enhances MLM-pretrained transformers by explicitly incorporating temporal information from patient trajectories.
Methods:
TOO-BERT encourages the model to learn complex causal relationships between diagnoses and treatments using a new self-supervised learning task, the Temporal Order Objective (TOO). This is achieved through two proposed methods: Conditional Code Swapping (CCS) and Conditional Visit Swapping (CVS).
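As an illustration of how such an order-perturbation task can be set up, the sketch below perturbs the order of codes in a trajectory and emits a binary label for a temporal-order classification head. This is a minimal, hypothetical sketch for intuition only; the function name, the uniform swap probability, and the unconditional choice of swap positions are assumptions, not the paper's exact CCS/CVS procedures (which swap conditionally).

```python
import random

def order_swap(trajectory, swap_prob=0.5, rng=None):
    """Hypothetical sketch of a code-swapping perturbation for a
    temporal-order objective (assumed details, not the paper's method).

    With probability `swap_prob`, two codes at different positions are
    swapped; the binary label tells the order-objective head whether
    the trajectory's temporal order was perturbed.
    """
    rng = rng or random.Random()
    seq = list(trajectory)
    if len(seq) >= 2 and rng.random() < swap_prob:
        i, j = rng.sample(range(len(seq)), 2)
        seq[i], seq[j] = seq[j], seq[i]
        return seq, 1  # order perturbed
    return seq, 0      # original order preserved
```

During pretraining, such labels would supervise a classification head alongside the MLM loss, pushing the encoder to represent which orderings of diagnoses and treatments are plausible.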
Results:
We evaluate TOO-BERT on two datasets, MIMIC-IV hospitalization records and the Malmö Diet cohort, comprising approximately 10 million and 8 million medical codes, respectively. TOO-BERT outperforms standard MLM-pretrained transformers in predicting Heart Failure (HF), Alzheimer's Disease (AD), and Prolonged Length of Stay (PLS), and notably excels in HF prediction even with limited fine-tuning data.
Conclusions:
Our results underscore the effectiveness of integrating temporal ordering objectives into MLM-pretrained models, enabling deeper insights into the complex relationships in EHR data. Attention analysis further reveals TOO-BERT’s ability to capture and represent sophisticated structural patterns within patient trajectories.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.