Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Mar 7, 2025
Date Accepted: Apr 17, 2025
Identifying Disinformation on the Extended Impacts of COVID-19: A Methodological Investigation Using a Fuzzy Ranking Ensemble of NLP Models
ABSTRACT
Background:
During the COVID-19 pandemic, the continuous spread of misinformation on the internet posed an ongoing threat to public trust in, and understanding of, epidemic prevention policies. Even with the pandemic under control, information about the risks of the long-term effects of COVID-19 and of reinfection still needs to be integrated into COVID-19 policies.
Objective:
This study introduces a deep learning approach that combines language models with a fuzzy rank-based ensemble method to detect misinformation about the long-term impacts of COVID-19.
Methods:
The dataset, comprising 566 genuine and 2361 fake samples, was collected from reliable open sources and refined using data processing techniques. Deep learning models, including HAN, BERT, and XLNet, were then trained on the collected data to detect misinformation about the long-term impacts of COVID-19. A fuzzy rank-based ensemble technique was used to combine the different deep models and further improve performance.
Results:
After training on the dataset, various classification methods were evaluated on the test set, including the fuzzy rank-based method and state-of-the-art large language models. The fuzzy rank-based ensemble method, which combines multiple language models, achieved an F1-score of 96.03%.
Conclusions:
Combining pretrained language models (PLMs) in a fuzzy rank-based ensemble built on the Gompertz function offers a novel prediction approach with the potential to improve accuracy and reliability. The experimental results also suggest that training on textual content alone can yield high prediction accuracy.
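As an illustration of the fuzzy rank-based fusion described above, the following is a minimal sketch and not the authors' implementation. It assumes one common formulation from the fuzzy rank literature: each model's class probability p is mapped to a rank 1 - exp(-exp(-2p)) through a re-parameterized Gompertz function, the ranks are summed across models, the sum is multiplied by the complement of the mean confidence, and the class with the lowest fused score is chosen. The constant 2.0, the complement factor, the function names, and the class indexing (0 = genuine, 1 = fake) are all assumptions made for this example; the abstract does not specify them.

```python
import math


def gompertz_rank(p):
    """Map a confidence p in [0, 1] to a fuzzy rank.

    Assumed form: 1 - exp(-exp(-2p)). Higher confidence gives a lower rank.
    """
    return 1.0 - math.exp(-math.exp(-2.0 * p))


def fuzzy_rank_ensemble(model_probs):
    """Fuse the class probabilities of several models for one sample.

    model_probs[m][c] is model m's probability for class c.
    Returns the index of the class with the lowest fused score.
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    scores = []
    for c in range(n_classes):
        # Sum of fuzzy ranks across models (lower means more support).
        rank_sum = sum(gompertz_rank(probs[c]) for probs in model_probs)
        # Complement of the mean confidence (lower means more support).
        complement = 1.0 - sum(probs[c] for probs in model_probs) / n_models
        scores.append(rank_sum * complement)
    return min(range(n_classes), key=scores.__getitem__)


# Hypothetical outputs of three classifiers for a single text,
# ordered as [P(genuine), P(fake)].
sample = [[0.40, 0.60], [0.30, 0.70], [0.55, 0.45]]
print(fuzzy_rank_ensemble(sample))
```

In this sketch the third model's disagreement is outweighed by the other two: the nonlinear rank rewards confident predictions more than a simple vote would, which is the usual motivation for this kind of fusion.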
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.