Developing a tool for identifying clinical risk from free text clinical records using natural language processing and machine learning
ABSTRACT
Background:
Electronic patient records (EPR) are an under-utilised yet valuable data source that has been extensively explored through research using natural language processing (NLP).
Objective:
This study applied NLP to create a risk identification tool capable of discerning high and low-risk veterans using EPR from a UK veteran mental health charity.
Methods:
A total of 20,342 notes were extracted for this purpose. To develop the risk tool, 70% of the records formed the training dataset, while the remaining 30% were allocated for testing and evaluation. The classification framework was devised and trained to categorise risk into a binary outcome: 1 for high risk, and 0 for low risk.
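The 70/30 split described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how the split was drawn, so the random shuffle, the fixed seed, the function name `split_notes`, and the placeholder records are all assumptions rather than the authors' procedure.

```python
import random


def split_notes(notes, train_fraction=0.7, seed=0):
    """Shuffle a copy of the notes and split it into train and test lists."""
    shuffled = list(notes)
    # A fixed seed keeps the split reproducible (assumed, not from the study).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder records of the form (note_text, risk_label),
# where 1 stands for high risk and 0 for low risk.
notes = [(f"note {i}", i % 2) for i in range(20342)]
train_set, test_set = split_notes(notes)
print(len(train_set), len(test_set))
```

A stratified split, which preserves the ratio of high-risk to low-risk notes in both subsets, would be a reasonable alternative if the classes are imbalanced.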
Results:
The efficacy of each classifier model was assessed by comparing its results with those from clinical risk assessments. This comparison allowed for the calculation of the positive predictive value, negative predictive value (0.73, 95% CI [0.71 to 0.75]), sensitivity (0.75, 95% CI [0.74 to 0.76]), F1 score (0.74, 95% CI [0.72 to 0.76]), and accuracy, which was measured using the Youden Index (0.73, 95% CI [0.71 to 0.76]).
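For reference, the measures named above can all be derived from the four cells of a confusion matrix that compares model predictions with the clinical risk assessments. The sketch below uses the conventional definitions, including the Youden index as sensitivity + specificity - 1. The function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "youden_index": sensitivity + specificity - 1,
    }


# Illustrative counts only (not the study's data).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Confidence intervals such as those reported above would need an additional step, for example bootstrapping over the test set, which this sketch does not cover.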
Conclusions:
The risk identification tool successfully determined the correct risk category of veterans from a large sample of clinical notes. Future studies should investigate whether this tool can detect more nuanced differences in risk.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.