Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Oct 6, 2018
Open Peer Review Period: Oct 6, 2018 - Oct 17, 2018
Date Accepted: Jan 5, 2019
The final, peer-reviewed published version of this preprint can be found here:

Liu Q, Chen Q, Shen J, Wu H, Sun Y, Ming WK

Data Analysis and Visualization of Newspaper Articles on Thirdhand Smoke: A Topic Modeling Approach

JMIR Med Inform 2019;7(1):e12414

DOI: 10.2196/12414

PMID: 30694199

PMCID: 6371067

Big social data analytics of mass communication related to third-hand smoke: A topic modeling approach

  • Qian Liu
  • Qiuyi Chen
  • Jiayi Shen
  • Huailiang Wu
  • Yimeng Sun
  • Wai Kit Ming

ABSTRACT

Background:

Third-hand smoke (THS) has been a growing topic for decades in China. THS consists of residual tobacco smoke pollutants that remain on surfaces and in dust; these pollutants are re-emitted into the gas phase or react with oxidants and other compounds in the environment to yield secondary pollutants.

Objective:

Collecting media reports from major media outlets and analyzing this subject using topic modeling can facilitate a better understanding of the role that the media plays in communicating this health concept.

Methods:

The data were retrieved from the Huike and Factiva news databases. A preliminary investigation of the Factiva database, focusing on articles dated between January 1, 2013, and December 31, 2017, combined with Latent Dirichlet Allocation (LDA), yielded the top 10 topics about THS. The modified tool LDAvis provides an overall view of the topic model in which topics are visualized as circles; multidimensional scaling is used to represent the inter-topic distances on a two-dimensional plane.
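To illustrate the kind of pipeline described here, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, followed by a Jensen-Shannon divergence between two topics as one possible inter-topic distance. It is not the authors' implementation: the toy documents, the function names `lda_gibbs` and `js_divergence`, the hyperparameters `alpha` and `beta`, and the choice of Jensen-Shannon divergence are all assumptions made for illustration, since the abstract does not state which sampler or distance measure was used. The final projection onto a two-dimensional plane by multidimensional scaling is omitted.

```python
import math
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(row)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed document-topic (theta) and topic-word (phi) distributions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return vocab, theta, phi


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up toy corpus, for illustration only.
docs = [
    "smoke cancer risk children cancer".split(),
    "cancer risk smoke exposure".split(),
    "ban control law public smoking".split(),
    "control law smoking ban restaurants".split(),
]
vocab, theta, phi = lda_gibbs(docs, n_topics=2)
distance = js_divergence(phi[0], phi[1])
```

Here `theta` holds one topic distribution per document and `phi` one word distribution per topic. In an LDAvis-style view, pairwise distances such as `distance` would be computed for all topic pairs and then passed to a multidimensional scaling routine to place the circles.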

Results:

Seven hundred and forty-five articles dated between January 1, 2013, and December 31, 2017, were found. The United States ranked first in terms of publications (152 articles on THS from 2013 to 2017). We found 279 news reports about THS from the Chinese media over the same period and 363 news reports from the United States. In our analysis of the percentage of news related to THS in China, topic 1, “cancer,” was the most popular topic, appearing in 31.9% of all news stories. Topic 2, “control of quitting smoking,” was related to roughly 15% of the news on THS.
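One way such per-topic percentages could be derived is to label each article with its highest-probability topic and report the share of articles per topic. The sketch below assumes this dominant-topic rule, which the abstract does not confirm; the function name `dominant_topic_shares` and the example values are hypothetical.

```python
def dominant_topic_shares(theta):
    """Percentage of documents whose highest-probability topic is each topic.

    theta is a list of per-document topic distributions (one row per document).
    """
    n_topics = len(theta[0])
    counts = [0] * n_topics
    for row in theta:
        counts[row.index(max(row))] += 1
    return [100.0 * c / len(theta) for c in counts]


# Made-up document-topic distributions for four articles and two topics.
theta_example = [
    [0.7, 0.3],
    [0.6, 0.4],
    [0.2, 0.8],
    [0.9, 0.1],
]
shares = dominant_topic_shares(theta_example)  # [75.0, 25.0]
```

An alternative would be to average the topic probabilities across articles rather than counting dominant topics; the two rules generally give different percentages.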

Conclusions:

Our study showed that (1) related diseases, (2) air and particulate matter (PM2.5), and (3) control and restrictions are the major concerns of the Chinese media reporting on THS.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.