Accepted for/Published in: JMIR Formative Research
Date Submitted: Jun 1, 2023
Date Accepted: Apr 3, 2024
Identifying X (Formerly Twitter) Posts Relevant to Dementia and COVID-19: A Machine Learning Approach
ABSTRACT
Background:
During the pandemic, dementia patients were identified as a vulnerable population. Twitter became an important source of information for people seeking updates on COVID-19, so identifying tweets relevant to dementia can be an important support for dementia patients and their caregivers. However, mining and coding relevant tweets can be daunting because of the sheer volume of tweets and the high percentage of irrelevant ones.
Objective:
The objective of this study was to automate the identification of tweets relevant to dementia and COVID-19 using natural language processing (NLP) and machine learning (ML) algorithms.
Methods:
We employed a combination of NLP and ML algorithms with manually annotated tweets to identify tweets relevant to dementia and COVID-19. We utilized three datasets containing more than 100,000 tweets and assessed the capability of various ML algorithms in correctly identifying relevant tweets.
Results:
Our results showed that (pre-trained) transfer learning algorithms outperformed traditional ML algorithms in identifying tweets relevant to dementia and COVID-19. Among the algorithms tested, the transfer learning algorithm ALBERT achieved an accuracy of 0.8292 and an AUC of 0.8353. ALBERT substantially outperformed the other algorithms tested, further emphasizing the superior performance of transfer learning algorithms for tweet classification.
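For readers less familiar with the two metrics reported above, the following is a minimal sketch of how accuracy and AUC can be computed for a binary relevant/irrelevant tweet classifier. It is not the study's evaluation code, which is not given here. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of tweets whose thresholded score matches the manual label.

    The 0.5 threshold is an assumption for this sketch.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen relevant tweet
    receives a higher score than a randomly chosen irrelevant one
    (ties count as half a win)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = relevant, 0 = irrelevant (manual annotation),
# with made-up classifier scores for each tweet.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.6, 0.1]

print(f"accuracy = {accuracy(labels, scores):.4f}")
print(f"AUC      = {roc_auc(labels, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.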
Conclusions:
Transfer learning algorithms like ALBERT are highly effective in identifying topic-specific tweets, even when trained with limited or adjacent data, highlighting their superiority over other ML algorithms. Such an automated approach reduces the workload of manual coding of tweets and facilitates their analysis for researchers and policymakers to support dementia patients and their caregivers.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.