Predictive performance of machine learning-based models for post-stroke clinical outcomes in comparison with conventional prognostic scores: a multicenter hospital-based observational study
ABSTRACT
Background:
Although machine learning is a promising tool for prognostication, the performance of machine learning in predicting outcomes after stroke remains to be examined.
Objective:
We aimed to examine how much data-driven models with machine learning improve predictive performance for post-stroke outcomes compared with conventional stroke prognostic scores and to elucidate how explanatory variables in the machine learning-based models differ from the items of the stroke prognostic scores.
Methods:
We used data from 10,513 patients registered in a multicenter prospective stroke registry in Japan between 2007 and 2017. The outcomes were poor functional outcome (modified Rankin Scale score >2) and death at 3 months post-stroke. Machine learning-based models were developed using all variables with regularization methods, random forests, or boosted trees. For comparison, we selected three stroke prognostic scores: the ASTRAL (Acute STroke Registry and Analysis of Lausanne) score, the PLAN (Preadmission comorbidities, Level of consciousness, Age, and Neurologic deficit) score, and the iScore. Item-based regression models were developed using the items of these three scores. Model performance was assessed in terms of discrimination and calibration. To compare the predictive performance of the data-driven models with that of the item-based models, we performed internal validation using random splits of identical populations into a training set (80% of patients) and a test set (20% of patients); the models were developed in the training set and validated in the test set. We evaluated the contribution of each variable to the models and compared the predictors used in the machine learning-based models with the items of the stroke prognostic scores.
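The validation scheme described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes binary outcome labels (1 = event, 0 = no event) and predicted probabilities from any fitted model, and the function names `split_80_20`, `auroc`, and `brier_score` are hypothetical. AUROC stands in for discrimination. The Brier score, which the Results report, is a common summary of overall accuracy that reflects calibration.

```python
import random


def split_80_20(n_patients, seed=0):
    """Randomly split row indices into 80% training and 20% test (hypothetical helper)."""
    idx = list(range(n_patients))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * n_patients)
    return idx[:cut], idx[cut:]


def auroc(y_true, y_prob):
    """AUROC: probability that a random event case is scored above a random
    non-event case, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one event and one non-event")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome
    (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy illustration only; these numbers are invented, not study data.
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.8]
train_idx, test_idx = split_80_20(10513)
print(len(train_idx), len(test_idx))
print(round(auroc(y_true, y_prob), 3), round(brier_score(y_true, y_prob), 3))
```

In a comparison of this kind, both the machine learning-based model and the item-based model would be fitted on the patients in `train_idx` and scored on the same `test_idx`, so that differences in the metrics reflect the models and not the samples.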
Results:
The mean (SD) age of study patients was 73.0 (12.5) years, and 59.1% were men. Both the area under the receiver operating characteristic curve and the area under the precision-recall curve for predicting post-stroke outcomes were higher for machine learning-based models than for item-based models in identical populations after random splits. Machine learning-based models also performed better than item-based models in terms of the Brier score. Machine learning-based models used explanatory variables, such as laboratory data, that differed from the items of the conventional stroke prognostic scores. Including these data in the machine learning-based models as explanatory variables improved performance in predicting outcomes after stroke, especially post-stroke death.
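For readers less familiar with the area under the precision-recall curve, the following minimal sketch computes it in its step-wise form (average precision). It is an illustration under stated assumptions, not the study's implementation: the function name `average_precision` is hypothetical, labels are assumed to be binary (1 = event), and tied scores are assumed absent. Unlike AUROC, the chance-level value of this metric equals the event prevalence, so it is most informative when the event is uncommon.

```python
def average_precision(y_true, y_prob):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one event")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank  # precision at the rank of each event
    return total / n_pos


# Toy illustration only; these numbers are invented, not study data.
print(round(average_precision([1, 0, 1, 0, 1], [0.9, 0.2, 0.6, 0.7, 0.8]), 3))
```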
Conclusions:
Machine learning-based models performed better in predicting post-stroke outcomes than regression models using the items of conventional stroke prognostic scores, though they required additional variables, such as laboratory data, to attain improved performance. Further studies are warranted to validate the usefulness of machine learning in clinical settings.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.