Currently submitted to: JMIR Medical Informatics

Date Submitted: Jun 1, 2026
Open Peer Review Period: Jun 17, 2026 - Aug 12, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Pain Severity Classification From Electronic Health Record in Children With Cerebral Palsy: Retrospective Study

  • Fahmida Yasmin Rifat
  • Thasina Tabashum
  • Joseph J Krzak
  • Karen M Kruger
  • Adam Graf
  • Ross S Chafetz
  • Jon R Davids
  • Anita Bagley
  • Susan E Sienko
  • Jeremy P Bauer
  • Mark V Albert

ABSTRACT

Background:

Pain is common and clinically consequential in children with cerebral palsy (CP), but retrospective pain severity assessment from electronic health records (EHRs) is difficult. Pain evidence is often dispersed across long narrative notes, many children have limited ability to self-report, and structured pain scores may be influenced by proxy reporting and encounter context.

Objective:

This study introduces the first systematic benchmark for three-class pain severity classification from clinical notes of children with CP and evaluates whether modern natural language processing (NLP) and large language model (LLM) approaches can recover clinically meaningful severity signals from real-world EHR documentation.

Methods:

We conducted a retrospective study using 15,969 de-identified clinical notes from 1,467 children with CP treated across the Shriners Children’s Hospital Network over 19 years. Notes were linked to encounter-level numeric pain ratings and grouped into No/Mild (0–3), Moderate (4–6), and Severe (7–10) classes. Using patient-stratified splits, we compared supervised encoder-fusion models, explanation-augmented local language models, frozen instruction-following LLMs, and a late-fusion ensemble.
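The labeling and splitting steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes integer ratings on a 0-10 scale, reads "patient-stratified" as keeping all notes from one patient in the same split, and uses hypothetical names (`pain_class`, `split_by_patient`, `patient_id`, `test_fraction`).

```python
import random


def pain_class(score):
    """Map an integer 0-10 pain rating to the three classes named in the Methods."""
    if not 0 <= score <= 10:
        raise ValueError(f"pain rating out of range: {score}")
    if score <= 3:
        return "No/Mild"
    if score <= 6:
        return "Moderate"
    return "Severe"


def split_by_patient(notes, test_fraction=0.2, seed=0):
    """Split notes so that no patient contributes notes to both sets.

    `notes` is a list of dicts with a hypothetical "patient_id" key.
    The split is made over patients, not over individual notes.
    """
    patients = sorted({note["patient_id"] for note in notes})
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_ids = set(patients[:n_test])
    train = [n for n in notes if n["patient_id"] not in test_ids]
    test = [n for n in notes if n["patient_id"] in test_ids]
    return train, test
```

Splitting over patients rather than notes avoids leakage when the same child has many encounters; the actual split proportions used in the study are not stated in the abstract.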

Results:

Performance improved as models incorporated broader context and complementary decision patterns. The best supervised encoder-fusion model achieved a macro F1 of 0.600, while explanation-augmented local LLMs improved to 0.620 when clinical notes were combined with generated explanations. Few-shot GPT-5 was the strongest standalone model on the held-out test set, achieving a macro F1 of 0.644. The best overall performance was achieved by a late-fusion ensemble of Bio-Mistral-7B, zero-shot GPT-5, and few-shot GPT-5, reaching a macro F1 of 0.66. SHapley Additive exPlanations (SHAP) analysis showed that No/Mild predictions were driven mainly by negated pain language, whereas Moderate and Severe predictions were associated with explicit pain terms and severity cues.
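For readers unfamiliar with the metric and the ensembling idea, the sketch below computes macro F1 (the unweighted mean of per-class F1) and combines several models' outputs by late fusion. The abstract does not state the fusion rule, so weighted averaging of per-class probabilities is an assumption here, and the names `macro_f1`, `late_fusion`, and `CLASSES` are hypothetical.

```python
CLASSES = ["No/Mild", "Moderate", "Severe"]


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def late_fusion(model_probs, weights=None, classes=CLASSES):
    """Combine per-model class probabilities for one note.

    `model_probs` is a list of dicts mapping class name to probability,
    one dict per model. Averaging is an assumed fusion rule.
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    fused = {
        c: sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in classes
    }
    # Ties resolve to the earlier class in `classes`.
    return max(classes, key=lambda c: fused[c])
```

Because macro F1 weights the three classes equally, a model cannot score well by predicting only the most frequent class.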

Conclusions:

Modern NLP and LLM-based approaches can extract clinically meaningful pain severity signals from real-world EHR notes of children with CP. Performance was strongest when models used long-context modeling, explanation-guided supervision, and complementary prediction patterns. Remaining errors reflected the subjective, proxy-mediated, and temporally variable nature of pain documentation. Therefore, these models should support retrospective pain research, cohort identification, and documentation review. To our knowledge, this is the first study to benchmark multi-level pain severity classification from unstructured EHR notes of children with CP.


 Citation

Please cite as:

Rifat FY, Tabashum T, Krzak JJ, Kruger KM, Graf A, Chafetz RS, Davids JR, Bagley A, Sienko SE, Bauer JP, Albert MV

Pain Severity Classification From Electronic Health Record in Children With Cerebral Palsy: Retrospective Study

JMIR Preprints. 01/06/2026:103165

DOI: 10.2196/preprints.103165

URL: https://preprints.jmir.org/preprint/103165


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.