JMIR Preprints #73601: Identifying Disinformation on the Extended Impacts of COVID-19: A Methodological Investigation Using a Fuzzy Ranking Ensemble of NLP Models


Identifying Disinformation on the Extended Impacts of COVID-19: A Methodological Investigation Using a Fuzzy Ranking Ensemble of NLP Models

Jian-An Chen;
Wu-Chun Chung;
Che-Lun Hung;
Chun-Ying Wu

ABSTRACT

Background:

During the COVID-19 pandemic, the continuous spread of misinformation on the internet posed an ongoing threat to public trust in, and understanding of, epidemic prevention policies. Even with the pandemic under control, information about the risks of the long-term effects of COVID-19 and of reinfection still needs to be integrated into COVID-19 policies.

Objective:

The study introduces a deep learning approach combining language models with a fuzzy rank-based ensemble method for detecting misinformation concerning the long-term impacts of COVID-19.

Methods:

The data, comprising 566 genuine and 2361 fake samples, were collected from reliable open sources and refined using data processing techniques. Deep learning models such as HAN, BERT, and XLNet were then trained on the collected data to detect misinformation about the long-term impacts of COVID-19. A fuzzy rank-based ensemble technique combined the different deep models to further improve performance.

Results:

After training on the dataset, various classification methods were evaluated on the test set, including the fuzzy rank-based method and state-of-the-art large language models. The fuzzy rank-based ensemble method, which combines multiple language models, achieved an F1-score of 96.03%.
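As a reminder of what the reported metric measures, the F1-score is the harmonic mean of precision and recall, which makes it more informative than plain accuracy on an imbalanced dataset such as this one. The following is a minimal sketch; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score given true positive, false positive, and false negative counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
example_f1 = f1_from_counts(tp=90, fp=10, fn=10)
```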

Conclusions:

The fusion of ensemble learning with pretrained language models (PLMs) and the Gompertz function, using a fuzzy rank-based methodology, offers a novel prediction approach with the potential to improve accuracy and reliability. Additionally, the experimental results suggest that training solely on textual content can yield high prediction accuracy.
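To illustrate how a Gompertz-based fuzzy rank ensemble might combine the outputs of several classifiers, here is a simplified sketch. The abstract does not give the exact formulation, so everything below is an assumption: the re-parameterized Gompertz form 1 - exp(-exp(-2p)), the product of summed fuzzy ranks and the complement of the mean confidence as the fused score, and all function names and example probabilities.

```python
import numpy as np


def gompertz_fuzzy_rank(probs: np.ndarray) -> np.ndarray:
    """Map confidence scores in [0, 1] to fuzzy ranks.

    Assumed form: 1 - exp(-exp(-2p)). Higher confidence gives a lower (better) rank.
    """
    return 1.0 - np.exp(-np.exp(-2.0 * probs))


def fuzzy_rank_ensemble(prob_list):
    """Fuse per-model class probabilities, each of shape (n_samples, n_classes).

    Returns the predicted class per sample and the fused score matrix.
    """
    stacked = np.stack(prob_list, axis=0)           # (n_models, n_samples, n_classes)
    rank_sum = gompertz_fuzzy_rank(stacked).sum(axis=0)
    complement_conf = 1.0 - stacked.mean(axis=0)    # low when models are confident
    fused = rank_sum * complement_conf
    # The class with the smallest fused score is taken as the prediction.
    return fused.argmin(axis=1), fused


# Hypothetical softmax outputs from three models for two samples
# (columns: class 0, class 1). These are not data from the study.
model_a = np.array([[0.90, 0.10], [0.40, 0.60]])
model_b = np.array([[0.80, 0.20], [0.30, 0.70]])
model_c = np.array([[0.60, 0.40], [0.55, 0.45]])

preds, scores = fuzzy_rank_ensemble([model_a, model_b, model_c])
```

In the second sample the third model disagrees with the other two, and the fused score still favors class 1 because the combined ranks and confidences of the agreeing models outweigh it. The published method may differ in detail, for example by restricting the fusion to each model's top-ranked classes.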

Citation

Please cite as:

Chen JA, Chung WC, Hung CL, Wu CY

Identifying Disinformation on the Extended Impacts of COVID-19: Methodological Investigation Using a Fuzzy Ranking Ensemble of Natural Language Processing Models

J Med Internet Res 2025;27:e73601

DOI: 10.2196/73601

PMID: 40397945

PMCID: 12138316


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 7, 2025

Date Accepted: Apr 17, 2025
