Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Aug 2, 2020
Date Accepted: Jan 16, 2021
Prediction of prolonged length of hospital stay after cancer surgery using machine learning on electronic health records: Retrospective cross-sectional study
ABSTRACT
Background:
Postoperative length of stay is a key indicator in the management of medical resources and an indirect parameter of the incidence of surgical complications and recovery of systemic conditions in cancer surgery. To our knowledge, machine learning models have not been used to predict prolonged length of stay after cancer surgery using extensive medical information.
Objective:
To develop a prediction model for prolonged length of stay after cancer surgery using a machine learning approach.
Methods:
In this retrospective study, we used the electronic health records (EHR) of 42,751 patients who underwent primary surgery for 17 types of cancer from January 1, 2000 to December 31, 2017, sourced from a single cancer center. These records included variables such as surgical factors, cancer factors, underlying diseases, functional laboratory assessments, general assessments, medications, and social factors. To predict prolonged length of stay after cancer surgery, we employed extreme gradient boosting classifier, multilayer perceptron, and logistic regression models. Prolonged postoperative length of stay was defined as the bed-days of the group of patients accounting for the top 50% of the distribution of bed-days for each cancer type.
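The outcome definition above can be illustrated with a minimal sketch. It assumes that "top 50% of the distribution of bed-days by cancer type" means a stay strictly longer than the median for that cancer type; the abstract does not say how ties at the median are handled, so that rule is an assumption. The function name `label_prolonged_stay` and the toy records are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import median


def label_prolonged_stay(records):
    """Return one boolean per (cancer_type, bed_days) record.

    True means the stay exceeds the median bed-days of its own cancer type.
    """
    # Group bed-days by cancer type so that each cancer gets its own threshold.
    by_type = defaultdict(list)
    for cancer_type, bed_days in records:
        by_type[cancer_type].append(bed_days)
    thresholds = {c: median(days) for c, days in by_type.items()}

    # Assumption: stays strictly above the median count as prolonged.
    return [bed_days > thresholds[cancer_type] for cancer_type, bed_days in records]


# Hypothetical toy data, not from the study.
records = [
    ("stomach", 7), ("stomach", 9), ("stomach", 14), ("stomach", 21),
    ("thyroid", 2), ("thyroid", 3), ("thyroid", 3), ("thyroid", 6),
]
labels = label_prolonged_stay(records)
```

Because the threshold is computed per cancer type, a 7-day stay can be unremarkable after stomach surgery while a 6-day stay is prolonged after thyroid surgery. With tied values at the median, fewer than half of the cases may end up labeled as prolonged.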
Results:
In the prediction of prolonged length of stay after cancer surgery, extreme gradient boosting classifier models demonstrated excellent performance for kidney and bladder cancer surgeries (area under the receiver operating characteristic curve [AUC] > 0.85). Moderate performance (AUC: 0.70–0.85) was observed for stomach, breast, colon, thyroid, prostate, cervix uteri, corpus uteri, and oral cancers. For stomach, breast, colon, thyroid, and lung cancers, each with more than 4000 cases, the extreme gradient boosting classifier model outperformed the other models. We identified risk variables for the prediction of prolonged postoperative length of stay for each cancer, and the importance of the variables differed depending on the cancer type. When operative time was added to the models trained on preoperative factors, the resulting models generally outperformed the corresponding models that used only preoperative variables.
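For readers unfamiliar with the metric, the AUC reported above can be read as the probability that a randomly chosen prolonged-stay case receives a higher predicted score than a randomly chosen non-prolonged case. The sketch below computes it by direct pairwise comparison and maps the result onto the bands used in this abstract (> 0.85 excellent, 0.70–0.85 moderate). The function names, the "below moderate" label, and the toy scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y]
    neg = [s for y, s in zip(y_true, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def performance_band(auc):
    """Map an AUC onto the bands named in the abstract."""
    if auc > 0.85:
        return "excellent"
    if auc >= 0.70:
        return "moderate"
    # The abstract does not name this band; the label is an assumption.
    return "below moderate"


# Hypothetical toy predictions, not from the study.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)
band = performance_band(auc)
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; for cohorts of the size described here, a rank-based implementation from an established library would be the practical choice.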
Conclusions:
A machine learning approach using EHR may improve the prediction of prolonged length of stay after primary cancer surgery. This algorithm may help in a more effective allocation of medical resources in cancer surgery.
Clinical Trial:
This study was approved by the institutional review board of the National Cancer Center-Korea, with a waiver for written informed consent (NCC-2018-0113).
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.