Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 4, 2020
Open Peer Review Period: Apr 3, 2020 - Apr 9, 2020
Date Accepted: Apr 16, 2020
Date Submitted to PubMed: Apr 17, 2020
Health Communication Through News Media During the Early Stage of the COVID-19 Outbreak in China: A Digital Topic Modeling Approach
ABSTRACT
Background:
In December 2019, a few COVID-19 cases were first reported in Wuhan, Hubei, China. Soon after, a growing number of cases were detected in other parts of China, and the disease spread across the country. As the disease spread rapidly, the mass media has been active in community education on COVID-19, delivering health information about this novel coronavirus, such as its pathogenesis, spread, and prevention and containment.
Objective:
This study collected media reports on COVID-19 and investigated the patterns of media-directed health communications as well as the role of media in this ongoing COVID-19 crisis in China.
Methods:
We used the Huike database to extract news articles about the coronavirus published by major press media between January 1, 2020, and February 20, 2020. We then sorted and analyzed the data using Python and the Python package Jieba. We selected a suitable number of topics based on the coherence score. We then ran Latent Dirichlet Allocation (LDA) topic modeling with that number of topics and generated the corresponding keywords and topic names. Finally, we grouped these topics into themes by plotting them on a two-dimensional plane via multidimensional scaling.
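The step of choosing a topic number from coherence can be illustrated with a minimal sketch. It assumes the UMass coherence measure, which the abstract does not name, and it uses standard-library Python only. The toy documents, the candidate top-word lists, and the function names are all hypothetical; in practice the documents would be segmented articles and the top-word lists would come from an LDA model fitted at each candidate topic number.

```python
import math
from itertools import combinations

def umass_coherence(top_words, docs):
    """UMass coherence of one topic (an assumed measure): the sum over
    ranked word pairs of log((D(w_i, w_j) + 1) / D(w_j)), where D counts
    the documents containing the given word(s)."""
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    # combinations yields (j, i) with j < i, so w_j is the higher-ranked word.
    for j, i in combinations(range(len(top_words)), 2):
        w_j, w_i = top_words[j], top_words[i]
        d_j = doc_freq(w_j)
        if d_j == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(w_i, w_j) + 1) / d_j)
    return score

def pick_topic_number(candidates, docs):
    """candidates maps a topic number to the top-word lists of its topics.
    Returns the topic number with the highest mean coherence, plus all scores."""
    scores = {
        k: sum(umass_coherence(t, docs) for t in topics) / len(topics)
        for k, topics in candidates.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical, already-tokenized toy corpus (not data from the study).
docs = [
    ["mask", "wash", "hands", "prevention"],
    ["mask", "prevention", "quarantine"],
    ["vaccine", "research", "treatment"],
    ["treatment", "hospital", "research"],
    ["donation", "support", "hospital"],
]

# Hypothetical top words per topic for two candidate topic numbers.
candidates = {
    2: [["mask", "prevention"], ["research", "treatment"]],
    3: [["mask", "research"], ["treatment", "donation"], ["support", "wash"]],
}

best_k, scores = pick_topic_number(candidates, docs)
print(best_k, scores)
```

In this toy case the two-topic candidate wins because its top words actually co-occur in the same documents, which is the property a coherence score is meant to reward.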
Results:
After removing duplicates, our search identified 7791 relevant news reports. We listed the number of articles published per day. According to the coherence value, we chose 20 as our number of topics and generated the topics’ themes and keywords. These topics were categorized into nine primary themes based on the topic visualization figure. The three most popular themes were prevention and control procedures, medical treatment and research, and global/local social/economic influences, accounting for 32.6%, 16.6%, and 11.8% of the collected reports, respectively.
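One way theme shares like those above could be computed is sketched here. It assumes each report is assigned to its single dominant topic and each topic belongs to one theme; the abstract does not state how the percentages were derived. The topic-to-theme mapping, the topic assignments, and the function name are hypothetical toy values, not the study's data.

```python
from collections import Counter

def theme_shares(dominant_topics, topic_to_theme):
    """Percentage of reports per theme, given each report's dominant topic.
    Themes are returned from most to least frequent."""
    counts = Counter(topic_to_theme[t] for t in dominant_topics)
    total = sum(counts.values())
    return {theme: round(100 * n / total, 1) for theme, n in counts.most_common()}

# Hypothetical mapping from topic index to theme name.
topic_to_theme = {
    0: "prevention and control",
    1: "prevention and control",
    2: "medical treatment and research",
    3: "donation and support",
}

# Hypothetical dominant topic for each of eight reports.
dominant_topics = [0, 1, 0, 2, 2, 3, 1, 0]

shares = theme_shares(dominant_topics, topic_to_theme)
print(shares)
```

An alternative under the same caveat would be to sum each report's topic weights rather than count dominant topics, which gives fractional credit to reports that mix themes.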
Conclusions:
Topic modeling of news articles can produce useful information about the significance of mass media for early health communication. Comparing the number of articles published each day with the development of the outbreak, we note that mass media news reports in China lagged behind the development of COVID-19. The major themes accounted for around half the content and tended to focus on the larger society rather than on individuals. The COVID-19 crisis has become a global issue, and society has also become concerned about donation and support as well as mental health. We recommend that future work address the mass media’s actual impact on readers during the COVID-19 crisis through sentiment analysis of news data.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.