Accepted for/Published in: JMIR Formative Research
Date Submitted: Jan 30, 2023
Date Accepted: Nov 20, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Leveraging social media data to inform healthcare supply-chain decisions with COVID-19 as a case study: A sentiment analysis and topic modelling approach
ABSTRACT
Background:
The COVID-19 pandemic has exposed major flaws in the personal protective equipment (PPE) supply chain, and healthcare facilities are at high risk of being pushed beyond their capacity. It is therefore paramount to develop new technologies that can serve as early warning and detection systems.
Objective:
In this study, Twitter sentiment analysis was used for two objectives. The first was to determine whether an increase in tweets pertaining to PPE shortages corresponds to an actual increase in PPE shortages in long-term care facilities in California. The second was to apply the same methodology to medical supply shortage prediction in developing countries; namely, to determine whether negative-sentiment tweets pertaining to COVID-19 in Brazil and India correspond to greater use of hospital intensive care unit (ICU) beds in these countries.
Methods:
Two time series were created: the frequency of negative-sentiment tweets (extracted using VADER, the Valence Aware Dictionary and sEntiment Reasoner) and the ground-truth frequency of medical resource shortages or demand. The Granger causality test was then used to determine whether the tweet time series is useful for forecasting medical resource shortages.
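As an illustration of how the tweet time series might be assembled, the following is a minimal sketch, not the authors' pipeline. It assumes each tweet already carries a VADER compound score, and it treats a score of -0.05 or lower as negative, which is the conventional VADER cut-off; the abstract does not state which threshold was used. The function name `daily_negative_counts` and the toy data are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

# Assumption: compound scores in [-1, 1] were computed beforehand with VADER.
# The -0.05 cut-off is the conventional VADER threshold for "negative";
# the study's actual threshold is not given in the abstract.
NEGATIVE_THRESHOLD = -0.05


def daily_negative_counts(tweets, start, end):
    """Count negative tweets per day, from start to end inclusive."""
    counts = Counter(
        day for day, compound in tweets if compound <= NEGATIVE_THRESHOLD
    )
    n_days = (end - start).days + 1
    # Days with no negative tweets are kept as explicit zeros so the
    # series stays aligned with the ground-truth series.
    return [counts[start + timedelta(days=i)] for i in range(n_days)]


# Hypothetical toy data: (date, compound score) pairs.
tweets = [
    (date(2020, 4, 1), -0.62),
    (date(2020, 4, 1), 0.40),
    (date(2020, 4, 2), -0.10),
    (date(2020, 4, 2), -0.71),
    (date(2020, 4, 4), -0.05),
]
tweet_series = daily_negative_counts(tweets, date(2020, 4, 1), date(2020, 4, 4))
print(tweet_series)
```

Filling empty days with zeros matters here because a Granger test compares the two series index by index, so both must cover the same dates at the same frequency.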
Results:
The sample size for the negative-sentiment tweet analysis was 6,970 tweets pertaining to California, 24,1105 tweets for the analysis of ICU bed demand in Brazil, and 8,613,049 tweets for India. For California, the results of the Granger test were significant at lag 2 (P = 0.035) and lag 5 (P = 0.005). For Brazil, the Granger test was significant for six of the 25 regions that passed the augmented Dickey-Fuller test: Amazonas (P = 0.039, lag 4), Bahia (P = 0.019, lag 1), Federal District (P = 0.010, lag 1), Espírito Santo (P = 0.006, lag 3), Roraima (P = 0.030, lag 1), and São Paulo (P = 0.013, lag 1). For India, the results of the Granger test were significant for ten of the 27 regions that passed the augmented Dickey-Fuller test: Tripura (P = 0.020, lag 1), Gujarat (P = 0.023, lag 3), West Bengal (P = 0.008, lag 2), Haryana (P = 0.045, lag 1), Bihar (P = 0.126, lag 1), Karnataka (P = 0.037, lag 3), Odisha (P = 0.048, lag 4), Andhra Pradesh (P = 0.037, lag 4), Jharkhand (P = 0.0004, lag 1), and Himachal Pradesh (P = 0.020, lag 3).
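The lag-specific results above come from tests that ask whether past values of the tweet series improve a prediction of the resource series beyond what the resource series' own past provides. The sketch below shows one standard form of that comparison, an F statistic contrasting a restricted and an unrestricted autoregression; it is a pure-Python illustration under stated assumptions, not the authors' code. It assumes both series are already stationary (the study screened regions with the augmented Dickey-Fuller test first), it stops at the F statistic rather than a P value, and the helper names `ols_rss` and `granger_f` and the simulated data are hypothetical.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y, lag):
    """F statistic for H0: lags 1..lag of x do not help predict y."""
    targets = y[lag:]
    restricted, unrestricted = [], []
    for t in range(lag, len(y)):
        y_lags = [y[t - i] for i in range(1, lag + 1)]
        x_lags = [x[t - i] for i in range(1, lag + 1)]
        restricted.append([1.0] + y_lags)             # own history only
        unrestricted.append([1.0] + y_lags + x_lags)  # plus history of x
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Simulated data (hypothetical): y follows x with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [0.0] + [0.8 * x[t - 1] + 0.3 * rng.gauss(0, 1) for t in range(1, 60)]

f_forward = granger_f(x, y, lag=1)  # does x help predict y?
f_reverse = granger_f(y, x, lag=1)  # does y help predict x?
print(f_forward, f_reverse)
```

In practice the F statistic would be compared against an F distribution with `lag` and `dof` degrees of freedom to obtain a P value, typically through a statistics library rather than hand-written code.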
Conclusions:
This study provides a novel approach for identifying regions with PPE shortages and high hospital bed demand by analyzing Twitter sentiment data, building on Twitter's usefulness as a tool for rapidly organizing relief efforts. Natural language processing-driven tweet extraction systems have the potential to be an effective method for early detection of surges in medical resource demand.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.