Accepted for/Published in: JMIR AI
Date Submitted: Dec 2, 2022
Open Peer Review Period: Dec 2, 2022 - Jan 27, 2023
Date Accepted: Apr 1, 2023
A scalable radiomics- and NLP-based machine learning pipeline to distinguish between painful and painless thoracic spinal bone metastases: Algorithm Development and Validation
ABSTRACT
Background:
Identifying objective pain biomarkers can improve the understanding of pain, its prognosis, and its management, and therefore has the potential to improve the quality of life of patients with cancer. Artificial intelligence can aid in extracting such biomarkers for patients with cancer who have bone metastases.
Objective:
To develop and evaluate a scalable natural language processing (NLP) and radiomics-based machine learning pipeline to differentiate between painless and painful bone metastasis (BM) lesions in simulation-CT images using imaging features (biomarkers) extracted from lesion-centerpoint-based regions of interest (ROIs).
Methods:
Patients treated at our comprehensive cancer center who received palliative radiotherapy for thoracic spine BM between January 2016 and September 2019 were included in this retrospective study. Physician-reported pain scores were extracted automatically from radiation oncology consultation notes using an NLP pipeline. BM centerpoints were manually pinpointed on CT images by radiation oncologists. Nested ROIs with various diameters were automatically delineated around these expert-identified BM centerpoints, and radiomics features were extracted from each ROI. The Synthetic Minority Oversampling Technique (SMOTE) for resampling, the least absolute shrinkage and selection operator (LASSO) for feature selection, and various machine learning classifiers were evaluated using precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC).
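As an illustration of the nested ROI step described above, the following minimal sketch selects the voxels that fall inside concentric spheres around a single centerpoint. It is not the authors' implementation: the function names, the voxel spacing, the image shape, and the diameters (10, 20, and 30 mm) are hypothetical values chosen for the example, and feature extraction itself is not shown.

```python
from itertools import product


def spherical_roi(center, diameter_mm, spacing_mm, shape):
    """Return the set of voxel indices whose centers lie inside a sphere.

    center      -- (i, j, k) voxel index of the lesion centerpoint
    diameter_mm -- sphere diameter in millimetres
    spacing_mm  -- voxel size along each axis in millimetres
    shape       -- image dimensions in voxels, used to clip the ROI
    """
    radius = diameter_mm / 2.0
    # Bounding box around the sphere, in voxel units, clipped to the image.
    ranges = []
    for c, s, n in zip(center, spacing_mm, shape):
        half = int(radius // s) + 1
        ranges.append(range(max(0, c - half), min(n, c + half + 1)))
    voxels = set()
    for idx in product(*ranges):
        # Squared physical distance (mm^2) from the centerpoint.
        dist_sq = sum(((i - c) * s) ** 2
                      for i, c, s in zip(idx, center, spacing_mm))
        if dist_sq <= radius ** 2:
            voxels.add(idx)
    return voxels


def nested_rois(center, diameters_mm, spacing_mm, shape):
    """Build one spherical ROI per diameter, all sharing the same centerpoint."""
    return {d: spherical_roi(center, d, spacing_mm, shape)
            for d in sorted(diameters_mm)}


# Hypothetical example: a 40 x 40 x 40 voxel volume with 1 x 1 x 2.5 mm voxels.
rois = nested_rois(center=(20, 20, 20),
                   diameters_mm=(10, 20, 30),
                   spacing_mm=(1.0, 1.0, 2.5),
                   shape=(40, 40, 40))
for diameter, voxels in rois.items():
    print(f"diameter {diameter} mm: {len(voxels)} voxels")
```

Because every sphere shares the same centerpoint, each smaller ROI is a subset of the next larger one, which is what makes the ROIs nested. Only the centerpoint has to be supplied by an expert.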
Results:
Radiation therapy consultation notes and simulation-CT images of 176 patients with thoracic spine BM (mean age 66 ± 14 years [SD]; 95 male) were used in this study. After BM centerpoint identification, 107 radiomics features were extracted from each spherical ROI using pyradiomics. Data were split into a training set (70%) and a hold-out test set (30%). In the test set, the accuracy, sensitivity, specificity, and AUC of our best performing model (Neural Network classifier on an ensemble ROI) were 0.82 (132 of 163), 0.59 (16 of 27), 0.85 (116 of 136), and 0.83, respectively.
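The test-set metrics can be related to the counts given in parentheses with a short calculation. The sketch below assumes that "painful" is the positive class, so that 16 of 27 painful lesions and 116 of 136 painless lesions were classified correctly; the function name is hypothetical. Recomputed from these counts, sensitivity and specificity round to the reported 0.59 and 0.85, while 132 of 163 gives an accuracy of roughly 0.81, slightly below the reported 0.82; the abstract does not say how the reported value was obtained. AUC cannot be recovered from counts alone and is not computed here.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # correct predictions over all lesions
        "sensitivity": tp / (tp + fn),    # recall on the positive (painful) class
        "specificity": tn / (tn + fp),    # recall on the negative (painless) class
    }


# Counts implied by the reported fractions (assumes painful = positive class).
metrics = confusion_metrics(tp=16, fn=27 - 16, tn=116, fp=136 - 116)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```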
Conclusions:
Our NLP- and radiomics-based machine learning pipeline successfully differentiated between painful and painless BM lesions. It is intrinsically scalable because it uses NLP to extract pain scores from clinical notes and requires only centerpoints to identify BM lesions in CT images.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.