Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Mar 29, 2021
Open Peer Review Period: Mar 29, 2021 - Apr 6, 2021
Date Accepted: May 16, 2021
Date Submitted to PubMed: May 17, 2021
Predicting outcomes in the Machine Learning era: The Piacenza score a purely data driven approach for mortality prediction in COVID-19 Pneumonia
ABSTRACT
Background:
Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated sufficient discriminatory capacity. Machine-learning algorithms represent a novel approach for data-driven prediction of clinical outcomes, with advantages over statistical modelling.
Objective:
To develop the Piacenza score, a machine-learning-based score, to predict 30-day mortality in patients with COVID-19 pneumonia.
Methods:
The study comprised 852 patients with COVID-19 pneumonia admitted to the Guglielmo da Saliceto Hospital (Italy) from February to November 2020. The patients’ medical history, demographic, and clinical data were collected in electronic health records. The overall patient dataset was randomly split into derivation and test cohorts. The score was obtained through the Naïve Bayes classifier and externally validated on 86 patients admitted to Centro Cardiologico Monzino (Italy) in February 2020. Using a forward-search algorithm, six features were identified: age; mean corpuscular haemoglobin concentration; PaO2/FiO2 ratio; temperature; previous stroke; gender. The Brier index was used to evaluate the ability of the ML model to stratify and predict observed outcomes. A user-friendly website (https://covid.7hc.tech.) was designed and developed to enable fast and easy use of the tool by the final user (i.e., the physician). To make the Piacenza score customizable, we added a personalized version of the algorithm to the website, which enables an optimized computation of the mortality risk score for a single patient when some variables used by the Piacenza score are not available. In this case, the Naïve Bayes classifier is retrained on the same derivation cohort using a different set of patient characteristics. We also compared the Piacenza score with the 4C score and with a Naïve Bayes algorithm with 14 features chosen a priori.
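As an illustration only, the combination of a Naïve Bayes classifier with a forward search over features could be sketched as follows. This is not the authors' implementation: the Gaussian likelihood, the use of the Brier score on a held-out split as the selection criterion, the stopping rule, the synthetic data, and all function names are assumptions made for the example.

```python
import math
import random

def fit_gaussian_nb(X, y, features):
    """Estimate class priors and per-feature mean/variance for each class.
    Assumption: continuous features with Gaussian class-conditional densities."""
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = {}
        for f in features:
            vals = [r[f] for r in rows]
            mean = sum(vals) / len(vals)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6
            stats[f] = (mean, var)
        model[c] = (len(rows) / len(X), stats)
    return model

def predict_proba(model, x):
    """Posterior probability of class 1 (death) under feature independence."""
    log_post = {}
    for c, (prior, stats) in model.items():
        lp = math.log(prior)
        for f, (mean, var) in stats.items():
            lp += -0.5 * math.log(2 * math.pi * var) - (x[f] - mean) ** 2 / (2 * var)
        log_post[c] = lp
    m = max(log_post.values())  # subtract the maximum for numerical stability
    e0, e1 = math.exp(log_post[0] - m), math.exp(log_post[1] - m)
    return e1 / (e0 + e1)

def brier(model, X, y):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((predict_proba(model, x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def forward_search(X_tr, y_tr, X_va, y_va, n_features):
    """Greedily add the feature that most lowers the validation Brier score;
    stop when no remaining feature improves it (hypothetical stopping rule)."""
    selected, best = [], float("inf")
    remaining = list(range(n_features))
    while remaining:
        scores = {f: brier(fit_gaussian_nb(X_tr, y_tr, selected + [f]), X_va, y_va)
                  for f in remaining}
        f_best = min(scores, key=scores.get)
        if scores[f_best] >= best:
            break
        best = scores[f_best]
        selected.append(f_best)
        remaining.remove(f_best)
    return selected, best

# Synthetic data (not patient data): features 0 and 1 carry signal, 2 and 3 are noise.
rng = random.Random(0)
def make_row():
    label = 1 if rng.random() < 0.3 else 0
    return [rng.gauss(2.0 * label, 1.0), rng.gauss(1.0 * label, 1.0),
            rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], label

data = [make_row() for _ in range(600)]
X = [d[0] for d in data]
y = [d[1] for d in data]
selected, score = forward_search(X[:400], y[:400], X[400:], y[400:], 4)
print("selected features:", selected, "validation Brier score:", round(score, 3))
```

In the same spirit, the personalized version described above would correspond to calling `fit_gaussian_nb` again on the derivation data with a feature list that omits whichever variable is unavailable for the patient at hand.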
Results:
The Piacenza score showed an AUC of 0.78 (95% CI 0.74-0.84, Brier score 0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score 0.16) in the external validation cohort, showing accuracy comparable to that of the 4C score and the Naïve Bayes model with a priori chosen features, which achieved an AUC of 0.78 (95% CI 0.73-0.83, Brier score 0.26) and 0.80 (95% CI 0.75-0.86, Brier score 0.17), respectively.
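For readers less familiar with the two metrics reported above, the following sketch shows how an AUC and a Brier score can be computed from predicted risks and observed outcomes. The percentile bootstrap used for the confidence interval is an assumption, since the abstract does not state how the intervals were obtained; the function names and the four-patient toy data are hypothetical.

```python
import random

def auc(y_true, y_prob):
    """Probability that a randomly chosen positive case receives a higher
    predicted risk than a randomly chosen negative case (ties count 0.5)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared error between predicted risk and the 0/1 outcome;
    lower is better."""
    return sum((p - t) ** 2 for p, t in zip(y_prob, y_true)) / len(y_true)

def bootstrap_ci(y_true, y_prob, metric, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # AUC is undefined when a resample contains one class only
        stats.append(metric(yt, [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy example: two survivors (0) and two deaths (1) with predicted risks.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
print("AUC:", auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("95% CI for AUC:", bootstrap_ci(y_true, y_prob, auc))
```

As a point of reference when reading Brier scores, a model that always predicts a risk of 0.5 scores exactly 0.25 regardless of the outcomes.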
Conclusions:
A personalized machine-learning-based score with purely data-driven feature selection is feasible and effective for predicting mortality in patients with COVID-19 pneumonia.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.