Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 13, 2023
Date Accepted: Dec 20, 2023
Using natural language processing to explore social media opinions around food security: A sentiment analysis and topic modelling study
ABSTRACT
Background:
The use of social media data in public health research is emerging and can be of great value for understanding patterns in public health, given that research in this area has traditionally relied on small-scale manual analysis. Exploring the breadth of social media data and better understanding the meaning behind it requires analysis approaches (e.g. data science and natural language processing) that can handle large datasets. Two such methods that have been used in public health are sentiment analysis and topic modelling; however, their use in the food (in)security area of public health nutrition is limited.
Objective:
To explore the potential use of natural language processing (NLP) tools to gather insight from real-world social media data around the public health issue of food security.
Methods:
A search strategy for obtaining tweets was developed using food (in)security terms. Tweets were collected using the Twitter application programming interface from 1 January 2019 to 31 December 2021 and filtered to Australia-based users only. Sentiment analysis of the tweets was undertaken using the Valence Aware Dictionary and sEntiment Reasoner (VADER). Topic modelling to explore the content of tweets was conducted using Latent Dirichlet Allocation with BigML. Sentiment, topic and engagement (the sum of likes, re-tweets, quotations and replies) were compared across years.
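As an illustration only, the sentiment labelling and engagement measure described above might be sketched as follows. The abstract does not report the cut-offs applied to sentiment scores or the structure of the tweet data, so the ±0.05 compound-score thresholds (those commonly suggested for the Valence Aware Dictionary and sEntiment Reasoner), the field names and the sample records are all assumptions, and the compound scores are taken as already computed.

```python
from collections import defaultdict

# Assumed cut-offs: commonly suggested for the compound score,
# not confirmed as the thresholds used in this study.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_sentiment(compound):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def engagement(tweet):
    """Engagement as the sum of likes, re-tweets, quotations and replies."""
    return tweet["likes"] + tweet["retweets"] + tweet["quotes"] + tweet["replies"]


def mean_engagement_by_year_and_sentiment(tweets):
    """Return {(year, sentiment label): mean engagement}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [engagement sum, tweet count]
    for tweet in tweets:
        key = (tweet["year"], label_sentiment(tweet["compound"]))
        totals[key][0] += engagement(tweet)
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical records for illustration; these are not study data.
sample_tweets = [
    {"year": 2019, "compound": -0.6, "likes": 10, "retweets": 4, "quotes": 1, "replies": 5},
    {"year": 2019, "compound": -0.1, "likes": 0, "retweets": 0, "quotes": 0, "replies": 0},
    {"year": 2019, "compound": 0.7, "likes": 3, "retweets": 1, "quotes": 0, "replies": 0},
    {"year": 2020, "compound": 0.0, "likes": 2, "retweets": 0, "quotes": 0, "replies": 0},
    {"year": 2020, "compound": -0.3, "likes": 6, "retweets": 2, "quotes": 0, "replies": 2},
]

summary = mean_engagement_by_year_and_sentiment(sample_tweets)
for (year, label), mean_value in sorted(summary.items()):
    print(year, label, mean_value)
```

Grouping by year and sentiment label mirrors the comparison across years described above; any test of statistical significance would be a separate step that this sketch does not attempt.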
Results:
A total of 38,070 tweets were collected from 14,880 unique Twitter users, with a larger proportion of tweets posted in 2020 than in the other years. Overall, sentiment was positive, although this varied when assessed by month across the three years. Positive sentiment remained higher during the COVID-19 lockdown periods in Australia. The topic model contained 10 topics (in order of probability in the dataset): ‘Global production’, ‘Food insecurity and health’, ‘Use of food banks’, ‘Giving to food banks’, ‘Family poverty’, ‘Food relief provision’, ‘Global food insecurity’, ‘Climate change’, ‘Australian food insecurity’ and ‘Human rights’. ‘Giving to food banks’, which focused on support and donation, had the highest proportion of positive sentiment, and ‘Global food insecurity’, which covered food insecurity prevalence worldwide, had the highest proportion of negative sentiment. Negative tweets received significantly higher engagement across 2019 and 2020. There was no clear relationship between topics that were more likely to be positive or negative and higher or lower engagement, indicating that the identified topics are discrete issues.
Conclusions:
In this study, we demonstrated the potential use of sentiment analysis and topic modelling to explore the evolution of sentiment and key topics around food security using social media data. Future use of natural language processing in food security requires context from, and interpretation by, public health experts, and has the potential to track dimensions or events related to food security to inform evidence-based decision-making in this area.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.