Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Feb 15, 2021
Date Accepted: Apr 27, 2021
Anthrax on Twitter: Analysis of Public Discussion of Anthrax Over Twelve Months of Data Collection
ABSTRACT
Background:
A computational framework using machine learning methods was created to collect tweets discussing anthrax, classify them for relevance within each month of data collection, and detect anthrax-related events.
Objective:
The objective of this study was to detect anthrax-related events and to determine the relevance of tweets and the topics of discussion over twelve months of data collection.
Methods:
Machine learning techniques were used to determine what people were tweeting about anthrax. Tweet counts were graphed over time to determine whether an event was detected, defined as a three-fold spike in tweets. A machine learning classifier was created to categorize tweets as relevant. Relevant tweets were examined by month using a topic modeling approach to determine the topics of discussion over time and how events influenced that discussion.
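The three-fold spike criterion can be illustrated with a minimal sketch. The abstract does not state the baseline against which a spike is measured, so the comparison below (a day's count against the median of the preceding seven days) is an assumption made for illustration, as are the function name `detect_spikes`, its parameters, and the example counts.

```python
from statistics import median


def detect_spikes(daily_counts, window=7, fold=3.0):
    """Return indices of days whose tweet count is at least `fold` times
    the median count of the preceding `window` days.

    The median-of-preceding-days baseline is an assumption; the abstract
    only specifies a three-fold spike in tweets.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        baseline = median(daily_counts[i - window:i])
        # Skip days with an empty baseline to avoid flagging noise.
        if baseline > 0 and daily_counts[i] >= fold * baseline:
            spikes.append(i)
    return spikes


# Hypothetical daily tweet counts (not data from the study).
counts = [100, 110, 95, 105, 98, 102, 99, 400, 120, 101]
spike_days = detect_spikes(counts)
print(spike_days)  # [7]
```

A median baseline is used here because a single spike day does not inflate it much, so the days following an event are not compared against an artificially high reference.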
Results:
Over the twelve months of data collection, 204,008 tweets were collected. Logistic regression performed best for relevancy (precision=0.81, recall=0.81, and F1-score=0.80). Twenty-six topics were found, relating to anthrax events, highly retweeted tweets, natural outbreaks, and news stories.
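For reference, the F1-score of a single class is the harmonic mean of precision $P$ and recall $R$. The worked value below uses the reported precision and recall; the abstract does not state how the scores were averaged across classes, so the comparison with the reported 0.80 is an observation, not a correction.

```latex
F_1 = \frac{2PR}{P + R}
\qquad\Rightarrow\qquad
\frac{2 \times 0.81 \times 0.81}{0.81 + 0.81} = 0.81
```

The reported F1-score of 0.80 is consistent with scores averaged over classes, because an average of per-class F1 values need not equal the harmonic mean of the averaged precision and recall.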
Conclusions:
This study demonstrated that tweets relating to anthrax can be collected and analyzed over time to determine what people are discussing and detect key anthrax-related events. Future studies can focus on opinion tweets only, use the methodology to study other terrorism events, or use the methodology to monitor for threats.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.