Accepted for/Published in: JMIR Formative Research

Date Submitted: Jun 1, 2023
Date Accepted: Apr 3, 2024

The final, peer-reviewed published version of this preprint can be found here:

Azizi M, Jamali AA, Spiteri RJ

Identifying X (Formerly Twitter) Posts Relevant to Dementia and COVID-19: Machine Learning Approach

JMIR Form Res 2024;8:e49562

DOI: 10.2196/49562

PMID: 38833288

PMCID: 11185906

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Identifying Tweets Relevant to Dementia and COVID-19: A Machine Learning Approach

  • Mehrnoosh Azizi
  • Ali Akbar Jamali
  • Raymond J Spiteri

ABSTRACT

Background:

During the pandemic, dementia patients were identified as a vulnerable population. Twitter became an important source of information for people seeking updates on COVID-19, and therefore, identifying tweets relevant to dementia can be an important support for dementia patients and their caregivers. However, mining and coding relevant tweets can be daunting due to the sheer volume and high percentage of irrelevant tweets.

Objective:

The objective of this study was to automate the identification of tweets relevant to dementia and COVID-19 using natural language processing (NLP) and machine learning (ML) algorithms.

Methods:

We employed a combination of NLP and ML algorithms with manually annotated tweets to identify tweets relevant to dementia and COVID-19. We utilized three datasets containing more than 100,000 tweets and assessed the capability of various ML algorithms in correctly identifying relevant tweets.
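
As a rough illustration of the kind of supervised relevance classification described here, the following minimal sketch trains a multinomial naive Bayes model on a handful of hand-labeled posts. It is not the authors' pipeline: the choice of naive Bayes as the "traditional" algorithm, the toy posts, the labels, and all function names are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a post and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train_naive_bayes(texts, labels):
    """Count tokens per class; labels are 1 (relevant) or 0 (irrelevant)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def predict_relevant_probability(model, text):
    """Return P(relevant | text) using add-one (Laplace) smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        log_p = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                log_p += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = log_p
    # Convert the two log scores into a normalized probability.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])

# Hypothetical, hand-written toy posts (not from the study's datasets).
toy_texts = [
    "Caring for my mum with dementia during covid lockdown is so hard",
    "Care homes need covid vaccines for residents with alzheimers and dementia",
    "New covid restrictions announced for restaurants today",
    "Watching the football match tonight with friends",
]
toy_labels = [1, 1, 0, 0]

model = train_naive_bayes(toy_texts, toy_labels)
p_relevant = predict_relevant_probability(
    model, "dementia caregivers struggle with covid isolation")
p_irrelevant = predict_relevant_probability(
    model, "football restaurants tonight")
```

With real data, the manually annotated tweets would take the place of `toy_texts` and `toy_labels`, and a held-out portion would be reserved for evaluation.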

Results:

Our results showed that (pre-trained) transfer learning algorithms outperformed traditional ML algorithms in identifying tweets relevant to dementia and COVID-19. Among the algorithms tested, the transfer learning algorithm ALBERT achieved an accuracy of 0.8292 and an AUC of 0.8353. ALBERT substantially outperformed the other algorithms tested, further emphasizing the superior performance of transfer learning algorithms for tweet classification.
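
For readers unfamiliar with the two metrics reported above, the sketch below shows one common way to compute accuracy and the area under the ROC curve (AUC) for a binary relevance classifier. The labels and scores are made-up toy values, not the study's data, and the 0.5 decision threshold is an assumption; the abstract does not state how the authors computed their figures.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of posts whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)

def roc_auc(y_true, y_score):
    """AUC as the probability that a random relevant post outscores a
    random irrelevant one (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy labels (1 = relevant) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

toy_accuracy = accuracy(y_true, y_score)  # 4 of 6 correct
toy_auc = roc_auc(y_true, y_score)        # 8 of 9 pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two numbers can differ for the same model.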

Conclusions:

Transfer learning algorithms like ALBERT are highly effective in identifying topic-specific tweets, even when trained with limited or adjacent data, highlighting their superiority over other ML algorithms. Such an automated approach reduces the workload of manual coding of tweets and facilitates their analysis for researchers and policymakers to support dementia patients and their caregivers.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.