Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Aug 29, 2023
Date Accepted: Feb 27, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Mining Clinical Notes for Physical Rehabilitation Exercise Information: A Step Toward Precision Rehabilitation
ABSTRACT
Objective: This study aims to develop natural language processing (NLP) algorithms to extract physical rehabilitation exercise information from the clinical notes of post-stroke patients.
Methods: We identified a cohort of patients diagnosed with stroke at the University of Pittsburgh Medical Center and retrieved their clinical notes that contain rehabilitation therapy notes. We created a novel and comprehensive clinical ontology to represent physical rehabilitation exercise information, which covers type of motion, side of body, location on body, plane of motion, duration, information on sets and reps, exercise purpose, exercise type, and body position. We developed a variety of NLP algorithms leveraging state-of-the-art techniques, including rule-based NLP algorithms, machine learning-based NLP algorithms (i.e., Support Vector Machine, Linear Regression, Gradient Boosting, and AdaBoost), and large language model (LLM)-based NLP algorithms (i.e., ChatGPT), to extract physical rehabilitation exercise information from clinical notes.
Results: The experiments showed that the rule-based NLP algorithm had the best performance for extracting most of the physical rehabilitation exercise concepts. Among the machine learning models, Gradient Boosting (GB) achieved the best performance on more concepts than any other model. The rule-based NLP algorithm performed well in extracting durations, sets, and reps, while GB excelled in range of motion and location detection. The LLM-based NLP algorithm achieved high recall with zero-shot and few-shot prompts but low precision and F1 scores. It occasionally outperformed simpler machine learning models and in one case beat the rule-based algorithm.
Conclusion: In this study, we developed and evaluated several NLP algorithms to extract physical rehabilitation exercise information from the clinical notes of post-stroke patients. Leveraging such information from electronic health records will advance the construction of predictive models that can recommend rehabilitation treatment options with optimal outcomes, ultimately realizing precision rehabilitation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.