Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 15, 2022
Open Peer Review Period: Dec 15, 2022 - Feb 9, 2023
Date Accepted: Jun 3, 2023

The final, peer-reviewed published version of this preprint can be found here:

Identifying Risk Factors Associated With Lower Back Pain in Electronic Medical Record Free Text: Deep Learning Approach Using Clinical Note Annotations

Katz A, Jaiswal A, Nesca M, Milios E

JMIR Med Inform 2023;11:e45105

DOI: 10.2196/45105

PMID: 37584559

PMCID: 10461403

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

A Deep Learning Approach To Identifying Risk Factors Associated With Lower Back Pain In Electronic Medical Record Free Text

  • Alan Katz
  • Aman Jaiswal
  • Marcello Nesca
  • Evangelos Milios

ABSTRACT

Background:

Lower back pain is a common, debilitating condition that affects a large population. It is a leading cause of disability and lost productivity, and the associated medical costs and lost wages place a significant burden on individuals and society. Recent advances in artificial intelligence (AI) and natural language processing (NLP) have opened new opportunities for identifying and managing risk factors for lower back pain. In this paper, we propose a deep learning model, train it on a dataset of clinical notes annotated with relevant risk factors, and evaluate its performance in identifying risk factors in new clinical notes.

Objective:

The primary objective is to develop a novel deep learning approach to detect, in clinical encounter notes, risk factors for underlying disease in patients presenting with lower back pain. The secondary objective is to propose solutions to potential challenges of using deep learning and NLP techniques to identify risk factors in electronic medical record (EMR) free text, and to make practical recommendations for future research in this area.

Methods:

We manually annotated clinical notes for the presence of six risk factors for severe underlying disease in patients presenting with lower back pain. The data were highly imbalanced, with only 12% of the annotated notes having at least one label. To address this imbalance, a combination of semantic matching and regular expressions was used to capture additional notes for annotation. Further analysis was conducted to study the impact of down-sampling, binary formulation of multi-label classification, and unsupervised pre-training on classification performance. Lastly, the proposed BERT-based model was compared with the original BERT baselines for detecting lower back pain risk factors.
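As a rough illustration of two of the steps described in the Methods, the sketch below flags notes for manual annotation with regular expressions and then recasts multi-label annotations as one binary target per risk factor. The label names, the patterns, and the helper names `flag_candidates` and `to_binary_targets` are hypothetical; the abstract does not list the six risk factors or the expressions used, and the semantic matching step is not shown. Reading "binary formulation" as one yes/no problem per label is also an assumption.

```python
import re

# Hypothetical label names and patterns, for illustration only.
# The abstract does not list the six risk factors or the expressions used.
PATTERNS = {
    "weight_loss": re.compile(r"\b(?:unexplained\s+)?weight\s+loss\b", re.IGNORECASE),
    "fever": re.compile(r"\bfever(?:s|ish)?\b", re.IGNORECASE),
    "trauma": re.compile(r"\b(?:trauma|fall|fell)\b", re.IGNORECASE),
}
LABELS = sorted(PATTERNS)


def flag_candidates(notes):
    """Return indices of notes that match at least one pattern.

    A match only nominates a note for manual annotation; it is not a label,
    because a regex cannot tell "fever" from "denies fever".
    """
    return [
        i for i, text in enumerate(notes)
        if any(p.search(text) for p in PATTERNS.values())
    ]


def to_binary_targets(annotations):
    """Turn per-note label sets into one 0/1 target list per label."""
    return {
        label: [int(label in labels) for labels in annotations]
        for label in LABELS
    }


# Toy notes and annotations, invented for this example.
notes = [
    "LBP x 2 weeks after a fall from ladder.",
    "Chronic back pain, no new symptoms.",
    "Back pain with unexplained weight loss and fevers.",
]
annotations = [{"trauma"}, set(), {"weight_loss", "fever"}]

candidates = flag_candidates(notes)
targets = to_binary_targets(annotations)
```

Each entry in `targets` could then be fed to a separate binary classifier, or the columns could be stacked into a single multi-label target matrix.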

Results:

Of 2749 labeled clinical notes, 347 had at least one label and 2402 had none. Down-sampling the training set to equalize the number of clinical notes with and without risk factors improved the average AUC by 21% for the BERT baseline. The proposed BERT-based model performed 3% better than the BERT baseline in multi-task learning. Unsupervised pre-training using causal language modeling on clinical notes further improved performance by 1%.
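The down-sampling step can be sketched as follows: keep every note with at least one risk factor and draw an equally sized random sample of notes without any. This is a minimal sketch, not the authors' code; the `downsample` function, the `(text, labels)` tuple layout, and the fixed seed are assumptions. Following the abstract, it would be applied to the training split only.

```python
import random


def downsample(notes, seed=0):
    """Balance notes with and without labels by sampling the larger group.

    `notes` is a list of (text, labels) pairs, where `labels` is a set of
    risk-factor names (empty when the note has no risk factor).
    """
    positives = [n for n in notes if n[1]]
    negatives = [n for n in notes if not n[1]]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(len(positives), len(negatives))
    balanced = rng.sample(positives, k) + rng.sample(negatives, k)
    rng.shuffle(balanced)
    return balanced


# Placeholder notes using the counts reported in the Results:
# 347 notes with at least one label, 2402 with none.
data = [("note with risk factor", {"some_label"})] * 347
data += [("note without risk factor", set())] * 2402

balanced = downsample(data, seed=1)
n_pos = sum(1 for _, labels in balanced if labels)
n_neg = sum(1 for _, labels in balanced if not labels)
```

With these counts the balanced set holds 347 notes of each kind, so most unlabeled notes are discarded; that trade-off is what the reported AUC comparison evaluates.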

Conclusions:

Primary care clinical notes are likely to require manipulation before meaningful free-text analysis can be performed. Applying BERT Transformer models for multi-label classification to down-sampled annotated clinical notes is useful for detecting risk factors that suggest an indication for imaging in patients with lower back pain.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.