Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 28, 2021
Date Accepted: Apr 18, 2021
Date Submitted to PubMed: Apr 26, 2021

The final, peer-reviewed published version of this preprint can be found here:

Quantifying Online News Media Coverage of the COVID-19 Pandemic: Text Mining Study and Resource

Krawczyk K, Chelkowski T, Laydon DJ, Mishra S, Xifara D, Gibert B, Flaxman S, Mellan T, Schwämmle V, Röttger R, Hadsund JT, Bhatt S

J Med Internet Res 2021;23(6):e28253

DOI: 10.2196/28253

PMID: 33900934

PMCID: 8174556

Quantifying the online news media coverage of the COVID-19 pandemic: Text Mining Study and Resource

  • Konrad Krawczyk
  • Tadeusz Chelkowski
  • Daniel J Laydon
  • Swapnil Mishra
  • Denise Xifara
  • Benjamin Gibert
  • Seth Flaxman
  • Thomas Mellan
  • Veit Schwämmle
  • Richard Röttger
  • Johannes T Hadsund
  • Samir Bhatt

ABSTRACT

Background:

Before the advent of an effective vaccine, non-pharmaceutical interventions such as mask wearing, social distancing and lockdowns have been the primary measures to combat the COVID-19 pandemic. Such measures are highly effective when there is high population-wide adherence, which requires information on the current risks posed by the pandemic alongside a clear exposition of the rules and guidelines in place.

Objective:

Here we analyse online news media coverage of COVID-19. We quantify the total volume of COVID-19 articles, their sentiment polarization and leading subtopics, to act as a reference to inform future communication strategies.

Methods:

We collected 26 million news articles from the front pages of 172 major online news sources in 11 countries (available at sciride.org). Using topic detection, we identified COVID-19-related content to quantify the proportion of total coverage the pandemic received in 2020. The sentiment analysis tool VADER was used to stratify the emotional polarity of COVID-19 reporting. Further topic detection and sentiment analysis were performed on COVID-19 coverage to reveal the leading themes in pandemic reporting and their respective emotional polarizations.
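
As a rough illustration of the two quantities described above (the share of coverage flagged as COVID-19-related, and the polarity strata), the following Python sketch may help. It uses a simple keyword match as a stand-in for the topic detection step, and it bins precomputed VADER compound scores with the ±0.05 cut-offs conventionally suggested in VADER's documentation. The keyword list, the function names and the thresholds are assumptions made for illustration; the abstract does not specify them.

```python
import re
from collections import Counter
from typing import Iterable, Sequence

# Hypothetical keyword list; the study's actual topic-detection terms are not given here.
COVID_PATTERN = re.compile(r"\b(?:covid(?:-19)?|coronavirus|sars-cov-2)\b", re.IGNORECASE)


def is_covid_related(headline: str) -> bool:
    """Flag a headline as COVID-19-related via a simple keyword match."""
    return COVID_PATTERN.search(headline) is not None


def covid_share(headlines: Sequence[str]) -> float:
    """Fraction of headlines flagged as COVID-19-related (0.0 for an empty input)."""
    if not headlines:
        return 0.0
    return sum(is_covid_related(h) for h in headlines) / len(headlines)


def polarity_label(compound: float, threshold: float = 0.05) -> str:
    """Bin a VADER compound score in [-1, 1] into a polarity stratum.

    The 0.05 threshold is the conventional default, assumed here for illustration.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def stratify(compounds: Iterable[float]) -> Counter:
    """Count how many scores fall into each polarity stratum."""
    return Counter(polarity_label(c) for c in compounds)


# Toy headlines, invented for this example.
headlines = [
    "COVID-19 cases rise as lockdown eases",
    "Local team wins championship",
    "Coronavirus vaccine trial begins",
    "Markets rally on tech earnings",
]
share = covid_share(headlines)

# Compound scores would normally come from the vaderSentiment package, e.g.
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"];
# the values below are made up so the sketch runs without that dependency.
strata = stratify([-0.8, 0.0, 0.6, -0.05])
```

Keeping the scoring step separate from the binning step means the cut-offs can be changed without re-scoring the corpus.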

Results:

We find that COVID-19 coverage accounted for approximately 25% of all front-page online news articles between January and October 2020. Sentiment analysis of English-speaking sources reveals that overall COVID-19 coverage is not exclusively negatively polarized, suggesting widely heterogeneous reporting of the pandemic. Within this heterogeneous coverage, 16% of COVID-19 news articles (or 4% of all English-speaking articles) can be classified as highly negatively polarized, citing issues such as death, fear or crisis.
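
The two percentages quoted for highly negative articles are consistent with the overall coverage share: 16% of a roughly 25% share is about 4%. The short check below assumes that the approximately 25% front-page share also holds for the English-speaking subset, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Figures taken from the Results paragraph.
covid_share_of_all = 0.25        # approx. share of front-page articles about COVID-19
highly_negative_of_covid = 0.16  # share of COVID-19 articles that are highly negative

# Share of all articles that are both COVID-19-related and highly negative.
highly_negative_of_all = covid_share_of_all * highly_negative_of_covid
print(f"{highly_negative_of_all:.0%}")  # prints 4%

# Read the other way: the COVID-19 share implied by the 4% and 16% figures.
implied_covid_share = 0.04 / highly_negative_of_covid
```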

Conclusions:

The goal of COVID-19 public health communication is to increase understanding of distancing rules and maximize the impact of any governmental policy. Our results suggest an information overload in COVID-19 reporting that could risk obscuring effective policy communication. Our data and analysis will inform health communication strategies to minimize the risks of COVID-19 while vaccination is rolled out.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.