Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Mar 12, 2020
Date Accepted: Apr 14, 2020
Date Submitted to PubMed: Apr 15, 2020

The final, peer-reviewed published version of this preprint can be found here:

Data Mining and Content Analysis of the Chinese Social Media Platform Weibo During the Early COVID-19 Outbreak: Retrospective Observational Infoveillance Study

Li J, Xu Q, Cuomo R, Purushothaman V, Mackey T

JMIR Public Health Surveill 2020;6(2):e18700

DOI: 10.2196/18700

PMID: 32293582

PMCID: 7175787

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Data Mining and Content Analysis of Chinese Social Media Platform Weibo During Early COVID-19 Outbreak: A Retrospective Observational Infoveillance Study

  • Jiawei Li
  • Qing Xu
  • Raphael Cuomo
  • Vidya Purushothaman
  • Tim Mackey

ABSTRACT

Background:

Coronavirus disease 2019 (COVID-19), which originated in Wuhan, China, in December 2019, is a rapidly spreading outbreak with over 100,000 cases globally as of early March 2020. Infoveillance approaches using social media can help characterize disease distribution and public knowledge, attitudes, and behaviors during outbreaks.

Objective:

To evaluate the association between the number of Chinese social media posts and the number of cases reported in Wuhan City during the early stages of the COVID-19 outbreak.

Methods:

Chinese-language messages from Wuhan were collected from the Chinese microblogging site Weibo for 39 days, from December 23, 2019, to January 30, 2020. Total daily cases of COVID-19 in China were obtained from the Chinese National Health Commission. A linear regression model was fitted to assess whether the number of social media posts could predict the number of reported cases. A qualitative review of social media posts was conducted to identify the predominant COVID-19-related user-generated themes.
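As a rough illustration of the regression step described above, the following is a minimal sketch of fitting a straight line to paired daily counts by ordinary least squares. The function name `fit_line`, the variable names, and the toy numbers are assumptions made for illustration only; they are not the authors' code or data, and the abstract does not state which software or model specification was used.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up toy numbers, NOT data from the study: cases = 2 * posts + 1.
daily_posts = [0, 10, 20, 30, 40]
daily_cases = [1, 21, 41, 61, 81]

slope, intercept = fit_line(daily_posts, daily_cases)
print(f"slope={slope:.2f} cases per post, intercept={intercept:.2f}")
```

In this sketch the slope is read as the expected change in reported cases per additional post. A real analysis would also need a significance test and attention to the time-series nature of the data, which this example leaves out.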

Results:

A total of 115,299 Weibo posts were obtained, with an average of 2,956 posts per day (min 0; max 13,587). Regression showed a significant positive relationship between the number of posts and the number of reported cases both within China and within Hubei province, with approximately 10 more COVID-19 cases per 25 social media posts (p < 0.001) and 10 more cases per 40 social media posts (p < 0.001), respectively. Early outbreak themes were characterized by public uncertainty regarding the risks posed by COVID-19, including posts exhibiting protective and higher-risk behavior.
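The figures reported above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the length of the collection window and the daily average, and restates the two reported ratios as cases per post. The variable names are assumptions, and reading "10 more cases per 25 posts" as a slope of 0.4 cases per post is an interpretation of the abstract's wording, not a coefficient quoted from the paper.

```python
from datetime import date

# Collection window stated in the Methods (inclusive of both end dates).
start = date(2019, 12, 23)
end = date(2020, 1, 30)
days = (end - start).days + 1

# Total posts stated in the Results.
total_posts = 115_299
mean_posts_per_day = total_posts / days

# Reported ratios restated as cases per single post (an interpretation).
cases_per_post_china = 10 / 25
cases_per_post_hubei = 10 / 40

print(days, round(mean_posts_per_day))
print(cases_per_post_china, cases_per_post_hubei)
```

The recomputed window length and daily average agree with the 39 days and 2,956 posts per day given in the abstract.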

Conclusions:

The results of this study provide initial insight into the origins of the COVID-19 outbreak based on quantitative and qualitative analysis of Chinese social media data. Future studies should continue to explore the utility of social media data for predicting COVID-19 disease severity, public reaction, and the effectiveness of outbreak communication.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.