Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 7, 2021
Open Peer Review Period: Apr 7, 2021 - Jun 2, 2021
Date Accepted: Feb 9, 2022
Optimal policy determination in sequential systemic and locoregional therapy of oropharyngeal squamous carcinomas: A patient-physician digital twin dyad with deep Q-learning for treatment selection
ABSTRACT
Purpose: Currently, the selection of patients for sequential versus concurrent chemotherapy/radiation regimens lacks evidentiary support and is based on locally optimal decisions at each step. We aim to optimize the multistep treatment of head and neck cancer patients and to predict multiple patient survival and toxicity outcomes. To this end, we develop, apply, and evaluate a first application of deep Q-learning (DQL) and simulation to this problem.
Patients and methods: The treatment decision DQL digital twin and the patient’s digital twin were created, trained, and evaluated on a dataset of 536 oropharyngeal squamous cell carcinoma (OPC) patients with the goal of, respectively, determining the optimal treatment decisions with respect to survival and toxicity metrics and predicting the outcomes of the optimal treatment on the patient. The models were trained on a randomly selected subset of 402 patients and evaluated on a separate set of 134 patients. Training and evaluation of the digital twin dyad were completed in August 2020. The dataset includes 3-step sequential treatment decisions and the complete relevant history of the patient cohort treated at MD Anderson Cancer Center between 2005 and 2013, with radiomics analysis performed for the segmented primary tumor volumes.
Results: On the validation set, the approach achieved 87.09% mean and 90.85% median accuracy in treatment outcome prediction, matching the clinicians’ outcomes and improving the (predicted) survival rate by +3.73% (95% CI: [-0.75%, +8.96%]) and the dysphagia rate by +0.75% (CI: [-4.48%, +6.72%]) when following DQL treatment decisions.
Conclusion: Given the prediction accuracy and the predicted improvement in medically relevant outcomes yielded by this approach, this digital twin dyad of the patient-physician dynamic treatment problem has the potential to aid physicians in determining the optimal course of treatment and in assessing its outcomes.
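As an illustration of the ideas named in the abstract, the sketch below shows a random 402/134 split of 536 patient indices and a single Q-learning update for a short sequential decision problem. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses a tabular Q-function in place of a deep network, and the function names, binary action set, state labels, learning rate, discount factor, and seed are all hypothetical.

```python
import random

# Hypothetical toy setup: each decision step offers two treatment options.
ACTIONS = (0, 1)
GAMMA = 1.0  # assumption: no discounting over a short, finite 3-step episode


def split_patients(n_total=536, n_train=402, seed=0):
    """Randomly split patient indices into training and evaluation sets."""
    ids = list(range(n_total))
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def q_learning_update(q, state, action, reward, next_state, done, lr=0.1):
    """One tabular Q-learning step.

    Moves Q(s, a) toward the target r + gamma * max_a' Q(s', a').
    A deep Q-learning model would regress a neural network onto the same
    target instead of updating a lookup table.
    """
    if done:
        best_next = 0.0  # no future value after the final decision
    else:
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    target = reward + GAMMA * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + lr * (target - old)
    return q[(state, action)]


def greedy_action(q, state):
    """Pick the action with the highest current Q-value in this state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

In this sketch, a reward observed at the final decision propagates backward to earlier decisions through the `max` term, which is what distinguishes a multistep policy from a sequence of locally optimal choices.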
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.