Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jun 25, 2020
Open Peer Review Period: Jun 25, 2020 - Jul 14, 2020
Date Accepted: Jul 27, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
AutoScore: A Machine Learning-Based Automatic Clinical Score Generator and Its Application to Mortality Prediction Using Electronic Health Records
ABSTRACT
Background:
Risk scores can be useful in clinical risk stratification and in the accurate allocation of medical resources, helping health providers improve patient care. Point-based scores are more understandable and explainable than more complex models and are now widely used in clinical decision making. However, developing a risk scoring model is nontrivial and has not yet been systematically presented, with few studies investigating methods of clinical score generation using electronic health records.
Objective:
We aimed to propose AutoScore, a machine learning-based automatic clinical score generator consisting of six modules for developing interpretable point-based scores. Future users can employ the AutoScore framework to create clinical scores effortlessly in various clinical applications.
Methods:
We proposed the AutoScore framework, which comprises six modules: variable ranking, variable transformation, score derivation, model selection, score fine-tuning, and model evaluation. To demonstrate the performance of AutoScore, we used data from Beth Israel Deaconess Medical Center (BIDMC) to build a scoring model for mortality prediction. We then compared it with other baseline models using receiver operating characteristic (ROC) analysis. We also developed an R software package to demonstrate the implementation of AutoScore.
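To make the score derivation step concrete, the following minimal Python sketch shows one common way to turn regression coefficients into integer points. It is an illustration under stated assumptions, not the AutoScore implementation: the function names, the variable categories, the coefficient values, and the rescaling to a 100-point maximum are all hypothetical and are not taken from this abstract.

```python
from typing import Dict

Coefs = Dict[str, Dict[str, float]]
Points = Dict[str, Dict[str, int]]


def coefficients_to_points(coefs: Coefs, max_total: int = 100) -> Points:
    """Convert per-category log-odds coefficients into integer points.

    Assumed scheme (hypothetical): shift each variable so its lowest-risk
    category scores 0, then rescale so the highest attainable total is
    approximately `max_total`, and round to whole points.
    """
    # Shift each variable so its smallest coefficient becomes the reference (0).
    shifted = {
        var: {cat: beta - min(cats.values()) for cat, beta in cats.items()}
        for var, cats in coefs.items()
    }
    # Largest attainable sum of shifted coefficients across all variables.
    total_range = sum(max(cats.values()) for cats in shifted.values())
    if total_range == 0:
        raise ValueError("all coefficients are identical; nothing to scale")
    scale = max_total / total_range
    return {
        var: {cat: round(beta * scale) for cat, beta in cats.items()}
        for var, cats in shifted.items()
    }


def total_score(points: Points, patient: Dict[str, str]) -> int:
    """Sum the points for the category each patient variable falls into."""
    return sum(points[var][cat] for var, cat in patient.items())


# Made-up coefficients for two binned variables (illustration only).
example_coefs = {
    "age": {"<50": 0.0, "50-70": 0.5, ">70": 1.0},
    "heart_rate": {"normal": 0.0, "high": 1.0},
}
example_points = coefficients_to_points(example_coefs)
high_risk = total_score(example_points, {"age": ">70", "heart_rate": "high"})
print(example_points, high_risk)
```

Keeping the points as small integers is what lets a clinician add up such a score by hand, which is the interpretability argument made for point-based scores.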
Results:
Implemented on a dataset of 44,918 individual intensive care admission episodes, the scoring models generated by AutoScore performed comparably to other standard methods (i.e., logistic regression, stepwise regression, LASSO, and random forest) in terms of predictive accuracy and model calibration, but required fewer predictors and offered high interpretability and accessibility. The 9-variable AutoScore-created point-based scoring model achieved an area under the ROC curve (AUC) of 0.780 (95% confidence interval [CI]: 0.764-0.798), while the logistic regression model with 24 variables had an AUC of 0.778 (95% CI: 0.760-0.795). Moreover, by integrating all necessary modules, the AutoScore framework supports the clinical research continuum and its automation.
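As a reading aid for the AUC figures, the sketch below computes an AUC from labels and scores and attaches a percentile bootstrap interval. This abstract does not say how the reported confidence intervals were obtained, so the bootstrap here is an assumption chosen for illustration, and the function names and toy data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (illustration only): 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 0.75
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
print(toy_auc, toy_ci)
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; on a cohort of tens of thousands of admissions a rank-based AUC computation would be the practical choice.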
Conclusions:
We developed an easy-to-use, automatic clinical score generator, AutoScore; systematically presented its structure; and demonstrated its superiority (in predictive performance and interpretability) over other conventional methods using a benchmark database. AutoScore will emerge as a potential scoring tool in various medical applications.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.