Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 29, 2023
Date Accepted: Feb 27, 2024

The final, peer-reviewed published version of this preprint can be found here:

Mining Clinical Notes for Physical Rehabilitation Exercise Information: Natural Language Processing Algorithm Development and Validation Study

Wang Y, Sivarajkumar S, Gao F, Denny P, Aldhahwani B, Visweswaran S, Bove A

JMIR Med Inform 2024;12:e52289

DOI: 10.2196/52289

PMID: 38568736

PMCID: 11024747

Mining Clinical Notes for Physical Rehabilitation Exercise Information: A Natural Language Processing Algorithm Development and Validation Study

  • Yanshan Wang
  • Sonish Sivarajkumar
  • Fengyi Gao
  • Parker Denny
  • Bayan Aldhahwani
  • Shyam Visweswaran
  • Allyn Bove

ABSTRACT

Background:

Precision rehabilitation holds promise in enhancing the physical capabilities of post-stroke patients through tailor-made therapy plans. Leveraging data from electronic health records (EHRs), we can streamline the development of personalized rehabilitation strategies. Nevertheless, the majority of valuable physical exercise information is buried within unstructured clinical notes from physical therapy sessions, making it challenging to extract the information needed to develop precision rehabilitation.

Objective:

This study aims to develop natural language processing (NLP) algorithms to extract physical rehabilitation exercise information from clinical notes of post-stroke patients.

Methods:

We identified a cohort of patients diagnosed with stroke at the University of Pittsburgh Medical Center and retrieved their clinical notes, which include rehabilitation therapy notes. We created a novel and comprehensive clinical ontology to represent physical rehabilitation exercise information, which covers type of motion, side of body, location on body, plane of motion, duration, information on sets and reps, exercise purpose, exercise type, and body position. We developed a variety of NLP algorithms leveraging state-of-the-art techniques, including rule-based NLP algorithms, machine learning-based NLP algorithms (i.e., Support Vector Machine, Linear Regression, Gradient Boosting, and AdaBoost), and large language model (LLM)-based NLP algorithms (i.e., ChatGPT) for the extraction of physical rehabilitation exercise information from clinical notes.
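To make the rule-based approach concrete, the following is a minimal sketch of how a few ontology concepts (sets and reps, duration, side of body) might be pulled from a note with regular expressions. The patterns, the function name `extract_exercise_info`, and the sample note are hypothetical illustrations, not the rules or data used in the study.

```python
import re

# Hypothetical patterns for illustration only; the study's actual rules are not shown here.
SETS_REPS = re.compile(r"\b(\d+)\s*(?:sets?\s*of|x)\s*(\d+)\b", re.IGNORECASE)
DURATION = re.compile(r"\b(\d+)\s*(seconds?|secs?|minutes?|mins?)\b", re.IGNORECASE)
SIDE = re.compile(r"\b(left|right|bilateral)\b", re.IGNORECASE)


def extract_exercise_info(note: str) -> dict:
    """Return the first match for each concept found in a free-text note."""
    found = {}
    match = SETS_REPS.search(note)
    if match:
        found["sets"] = int(match.group(1))
        found["reps"] = int(match.group(2))
    match = DURATION.search(note)
    if match:
        found["duration"] = f"{match.group(1)} {match.group(2).lower()}"
    match = SIDE.search(note)
    if match:
        found["side_of_body"] = match.group(1).lower()
    return found


# Invented example note, not taken from any patient record.
note = "Pt performed right shoulder flexion, 3 sets of 10, held 30 seconds."
print(extract_exercise_info(note))
# {'sets': 3, 'reps': 10, 'duration': '30 seconds', 'side_of_body': 'right'}
```

Rules of this kind are easy to audit and tend to suit concepts with regular surface forms, such as counts and durations; concepts with more varied wording would likely need a richer lexicon or a learned model.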

Results:

The experiments showed that the rule-based NLP algorithm had the best performance for extracting most of the physical rehabilitation exercise concepts. Among the machine learning models, Gradient Boosting (GB) achieved the best performance for a larger number of concepts than the other models. The rule-based NLP algorithm performed well for extracting handled durations, sets, and reps, while GB excelled in range of motion and location detection. The LLM-based NLP algorithm achieved high recall with zero-shot and few-shot prompts but low precision and F1 scores. It occasionally outperformed simpler machine learning models and once beat the rule-based algorithm.

Conclusion:

In this study, we developed and evaluated several NLP algorithms to extract physical rehabilitation exercise information from clinical notes of post-stroke patients. Leveraging such information from electronic health records is a step toward constructing predictive models that can recommend rehabilitation treatment options with optimal outcomes, ultimately realizing precision rehabilitation.
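The precision, recall, and F1 metrics mentioned in the Results can be illustrated with a short sketch. The function name `precision_recall_f1` and the toy annotation sets below are assumptions for illustration and do not reproduce any figures from the study; the example only shows how a system that over-predicts can reach high recall with low precision.

```python
def precision_recall_f1(gold: set, predicted: set) -> tuple:
    """Compute precision, recall, and F1 from sets of gold and predicted items."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data: the system finds every gold mention but also flags four spurious ones.
gold = {1, 2, 3, 4}
predicted = {1, 2, 3, 4, 5, 6, 7, 8}
precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.50 recall=1.00 f1=0.67
```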




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.