Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Mar 20, 2018
Open Peer Review Period: Mar 21, 2018 - Aug 17, 2018
Date Accepted: Dec 10, 2018
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Causal Relationships Among Pollen Counts, Tweet Numbers, and Patient Numbers for Seasonal Allergic Rhinitis Surveillance: Retrospective Analysis
Background:
Health-related social media data are increasingly used in disease-surveillance studies, which have demonstrated moderately high correlations between the number of social media posts and the number of patients. However, there is a need to understand the causal relationship between the behavior of social media users and the actual number of patients in order to increase the credibility of disease surveillance based on social media data.
Objective:
This study aimed to clarify the causal relationships among pollen count, the posting behavior of social media users, and the number of patients with seasonal allergic rhinitis in the real world.
Methods:
This analysis was conducted using datasets of pollen counts, tweet numbers, and numbers of patients with seasonal allergic rhinitis from Kanagawa Prefecture, Japan. We examined daily pollen counts for Japanese cedar (the major cause of seasonal allergic rhinitis in Japan) and hinoki cypress (which commonly complicates seasonal allergic rhinitis) from February 1 to May 31, 2017. The daily numbers of tweets that included the keyword “kafunshō” (Japanese for seasonal allergic rhinitis) were calculated between January 1 and May 31, 2017. Daily numbers of patients with seasonal allergic rhinitis from January 1 to May 31, 2017, were obtained from three healthcare institutions that participated in the study. The Granger causality test was used to examine the causal relationships among pollen count, tweet numbers, and the number of patients with seasonal allergic rhinitis from February to May 2017. To determine whether time-variant factors affect these causal relationships, we separately analyzed the main seasonal allergic rhinitis phase (February to April), when Japanese cedar trees actively produce and release pollen.
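As an illustration of the kind of test named above, the following is a minimal sketch of a bivariate Granger causality F statistic, written in plain Python. It is not the authors' implementation: the lag order of 2, the function names `ols_rss` and `granger_f`, and the synthetic `driver` and `response` series are all assumptions made for the example, and no study data are used. The idea is to compare a regression of a series on its own lags (restricted model) with one that also includes lags of the candidate cause (full model).

```python
import random


def ols_rss(rows, target):
    """Residual sum of squares of a least-squares fit of target on rows."""
    k = len(rows[0])
    # Build the normal equations A b = c.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= f * A[col][j]
            c[r] -= f * c[col]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - tail) / A[i][i]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, target)
    )


def granger_f(x, y, lags=2):
    """F statistic for H0: lags of x add nothing to predicting y from its own lags."""
    n = len(y)
    times = range(lags, n)
    target = y[lags:]
    # Restricted model: intercept plus lags of y only.
    restricted = [[1.0] + [y[t - l] for l in range(1, lags + 1)] for t in times]
    # Full model: restricted regressors plus lags of x.
    full = [
        row + [x[t - l] for l in range(1, lags + 1)]
        for row, t in zip(restricted, times)
    ]
    rss_r = ols_rss(restricted, target)
    rss_f = ols_rss(full, target)
    df_den = len(target) - (2 * lags + 1)
    return ((rss_r - rss_f) / lags) / (rss_f / df_den)


# Synthetic data (assumption, not study data): response follows driver by one day.
random.seed(0)
driver = [random.gauss(0, 1) for _ in range(120)]
response = [0.0]
for t in range(1, 120):
    response.append(0.8 * driver[t - 1] + random.gauss(0, 0.5))

f_driver_to_response = granger_f(driver, response, lags=2)
f_response_to_driver = granger_f(response, driver, lags=2)
print(f_driver_to_response, f_response_to_driver)
```

In this sketch a large F statistic in one direction and a small one in the other is what a one-way Granger relationship would look like. A P value would be obtained by comparing the statistic with an F distribution with `lags` and `df_den` degrees of freedom; in practice a statistics library (for example, the `grangercausalitytests` function in statsmodels) would likely be used, and the lag order would need to be chosen for the data at hand.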
Results:
Increases in pollen count were found to increase the number of tweets during the overall study period (P=.04), but not the main seasonal allergic rhinitis phase (P=.05). In contrast, increases in pollen count were found to increase patient numbers in both the overall study period (P=.04) and the main seasonal allergic rhinitis phase (P=.01). Increases in the number of tweets increased the patient numbers during the main seasonal allergic rhinitis phase (P=.02), but not the overall study period (P=.89). Patient numbers did not affect the number of tweets in either the overall study period (P=.24) or the main seasonal allergic rhinitis phase (P=.47).
Conclusions:
Understanding the causal relationships among pollen counts, tweet numbers, and numbers of patients with seasonal allergic rhinitis is an important step to increasing the credibility of surveillance systems that use social media data. Further in-depth studies are needed to identify the determinants of social media posts described in this exploratory analysis.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.