Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jul 24, 2025
Date Accepted: Apr 20, 2026
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Automated Glasgow Coma Scale Score Extraction: Mining Unstructured Electronic Health Records
ABSTRACT
Background:
Multicenter electronic health records (EHR) can support quality improvement and comparative effectiveness research in critical care. However, EHR-based research is limited by the difficulty of abstracting key clinical variables, such as a patient’s level of consciousness.
Objective:
The objective of our study was to develop a natural language processing (NLP) model to predict Glasgow Coma Scale (GCS) scores from daily EHR notes.
Methods:
The study included adult patients (≥18 years) admitted to Massachusetts General Brigham (MGB) hospitals (2017-2024) and patients from the Medical Information Mart for Intensive Care III (MIMIC-III) database v1.4 (2001-2012). A dataset comprising daily notes, age, sex, and admission type for all patients from both institutions was split into training and hold-out test sets (70%/30%). We trained an ordinal regression model, “ordinalNet,” with an elastic net penalty to predict the lowest daily GCS score as one of three levels: severe (GCS 3-8), moderate (GCS 9-12), and mild (GCS 13-15). Model performance was assessed in the hold-out test set (MGB+MIMIC) using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
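As an illustration of the outcome definition only, the three-level label for the lowest daily GCS score could be derived as in the minimal Python sketch below. The function names and the `(patient_id, day, gcs_score)` record layout are assumptions made for this example and are not taken from the study; the sketch does not reproduce the ordinalNet model or the NLP pipeline.

```python
from typing import Dict, Iterable, Tuple


def gcs_category(score: int) -> str:
    """Map a total GCS score (3-15) to the three levels used as the outcome."""
    if not 3 <= score <= 15:
        raise ValueError(f"GCS total must be between 3 and 15, got {score}")
    if score <= 8:
        return "severe"    # GCS 3-8
    if score <= 12:
        return "moderate"  # GCS 9-12
    return "mild"          # GCS 13-15


def lowest_daily_category(
    records: Iterable[Tuple[str, int, int]],
) -> Dict[Tuple[str, int], str]:
    """Return the category of the lowest GCS score per (patient, day).

    `records` is a hypothetical iterable of (patient_id, day, gcs_score).
    """
    lowest: Dict[Tuple[str, int], int] = {}
    for patient_id, day, score in records:
        key = (patient_id, day)
        # Keep only the minimum score observed for this patient-day.
        if key not in lowest or score < lowest[key]:
            lowest[key] = score
    return {key: gcs_category(score) for key, score in lowest.items()}
```

For example, a patient-day with recorded scores of 14 and 7 would be labeled severe, because the lowest score of the day determines the level.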
Results:
Our modeling cohort included 55,285 patients (MGB = 36,696; MIMIC = 18,589) with 122,010 days of hospitalization; the average age was 64 [SD 17] years, 56% were male, and 76% were White. The ordinalNet model achieved AUROC and AUPRC [95% CI] of 0.91 [0.91-0.91] and 0.84 [0.83-0.84] for MGB + MIMIC; 0.91 [0.90-0.91] and 0.83 [0.82-0.84] for MGB; and 0.91 [0.90-0.91] and 0.83 [0.83-0.84] for MIMIC. For the severe level (GCS 3-8), the model achieved an AUROC and AUPRC of 0.97 [0.97-0.97] and 0.94 [0.93-0.94].
Conclusions:
Our NLP-based model can enable large-scale phenotyping of neurological assessments and critical care research studies.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.