
Accepted for/Published in: JMIR Formative Research

Date Submitted: Nov 9, 2021
Date Accepted: Apr 21, 2022

The final, peer-reviewed published version of this preprint can be found here:

Albalawi Y, Nikolov NS, Buckley J

Pretrained Transformer Language Models Versus Pretrained Word Embeddings for the Detection of Accurate Health Information on Arabic Social Media: Comparative Study

JMIR Form Res 2022;6(6):e34834

DOI: 10.2196/34834

PMID: 35767322

PMCID: 9280463

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Pretrained Transformer Language Models vs Pretrained Word Embeddings for the Detection of Accurate Health Information on Arabic Social Media

  • Yahya Albalawi; 
  • Nikola S Nikolov; 
  • Jim Buckley

ABSTRACT

Background:

In recent years, social media has become a major channel for health-related information in Saudi Arabia. While social media makes accurate health-related information easily accessible to many people, it has also become a channel through which health-related misinformation spreads easily. Prior health informatics studies suggest that a large portion of health-related posts on social media are inaccurate. Given the subject matter and the scale of such information, it is important to be able to automatically discriminate between accurate and inaccurate Arabic health-related posts.

Objective:

The first goal of this study is to generate a data set of generic health-related tweets in Arabic, labeled as either accurate or inaccurate health information. The second objective is to leverage this data set to train a state-of-the-art deep learning model for detecting the accuracy of Arabic health-related tweets. In particular, this study aims to train and compare the performance of multiple deep learning models that utilize pretrained word embeddings and transformer language models.

Methods:

We used 900 health-related tweets from a previously published data set and applied a pretrained model to extract an additional 900 health-related tweets from a second data set collected specifically for this study. The resulting 1800 tweets were labeled by two doctors as “accurate,” “inaccurate,” or “unsure.” The doctors agreed on 779 tweets, each labeled as either “accurate” or “inaccurate.” Nine variations of pretrained transformer language models were then trained and validated on 623 tweets (80% of the data set) and tested on 156 tweets (20% of the data set). For comparison, we also trained a bidirectional long short-term memory (BLSTM) model on the same data set, with seven different pretrained word embeddings as the input layer. The models were compared in terms of their accuracy, precision, recall, F1 score, and macro-average F1 score.
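As a minimal sketch of how these evaluation metrics can be computed with scikit-learn (the labels below are hypothetical placeholders, with 1 marking an accurate tweet and 0 an inaccurate one; the study's actual predictions and data are not reproduced here):

```python
from sklearn.metrics import accuracy_score, f1_score, precision_recall_fscore_support

# Hypothetical gold labels and model predictions (1 = accurate, 0 = inaccurate)
y_true = [1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

# Overall accuracy: fraction of tweets classified correctly
accuracy = accuracy_score(y_true, y_pred)

# Precision, recall, and F1 for the positive ("accurate") class
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)

# Macro-average F1: unweighted mean of the per-class F1 scores,
# which treats the "accurate" and "inaccurate" classes equally
macro_f1 = f1_score(y_true, y_pred, average="macro")

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} macro_f1={macro_f1:.3f}")
```

The macro average matters here because the classes are imbalanced (38% vs 62% in the study's data set), so a plain F1 on the majority class alone could overstate performance.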

Results:

We constructed a data set of labeled tweets, 38% labeled as inaccurate health information and 62% as accurate health information; tweets on which at least one annotator was unsure were excluded. Of the deep learning models investigated, the AraBERTv0.2 Large model achieved the best overall accuracy (approximately 87.8%), with an F1 score of 87%.

Conclusions:

Our results indicate that the pretrained language model AraBERTv0.2 performs best at classifying tweets as accurate or inaccurate health information. Future studies should consider ensemble learning to combine the best-performing models, as this may yield further improvements.


Citation

Please cite as:

Albalawi Y, Nikolov NS, Buckley J

Pretrained Transformer Language Models Versus Pretrained Word Embeddings for the Detection of Accurate Health Information on Arabic Social Media: Comparative Study

JMIR Form Res 2022;6(6):e34834

DOI: 10.2196/34834

PMID: 35767322

PMCID: 9280463

Per the author's request, the PDF is not available.