JMIR Preprints #8627: Twitter-Based Influenza Detection After Flu Peak via Tweets With Indirect Information: Text Mining Study

Twitter-Based Influenza Detection After Flu Peak via Tweets With Indirect Information: Text Mining Study

Shoko Wakamiya;
Yukiko Kawai;
Eiji Aramaki

ABSTRACT

Background:

The recent rise in popularity and scale of social networking services (SNSs) has resulted in an increasing need for SNS-based information extraction systems. A popular application of SNS data is health surveillance for predicting an outbreak of epidemics by detecting diseases from text messages posted on SNS platforms. Such applications share the following logic: they incorporate SNS users as social sensors. These social sensor–based approaches also share a common problem: SNS-based surveillance is much more reliable when sufficient numbers of users are active, whereas small or inactive populations produce inconsistent results.

Objective:

This study proposes a novel approach to estimate the trend of patient numbers using indirect information covering both urban areas and rural areas within the posts.

Methods:

We presented a TRAP model by embedding both direct information and indirect information. A collection of tweets spanning 3 years (7 million influenza-related tweets in Japanese) was used to evaluate the model. Both direct information and indirect information that mention other places were used. As indirect information is less reliable (too noisy or too old) than direct information, the indirect information data were not used directly and were considered as inhibiting direct information. For example, when indirect information appeared often, it was considered as signifying that everyone already had a known disease, leading to a small amount of direct information.
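The inhibition idea described above can be illustrated with a small sketch. This is not the TRAP formula, which the abstract does not give: the functional form, the function name `compensate_inhibition`, the weight `alpha`, and the toy counts are all assumptions made for illustration only. The sketch supposes that frequent indirect mentions suppress direct mentions, so the observed direct count is scaled up in proportion to the indirect count.

```python
def compensate_inhibition(direct_counts, indirect_counts, alpha=0.5):
    """Scale each direct count up by the (assumed) inhibiting effect of indirect mentions.

    Hypothetical form: observed_direct = signal / (1 + alpha * indirect),
    so signal = observed_direct * (1 + alpha * indirect).
    `alpha` is an illustrative weight, not a value from the study.
    """
    if len(direct_counts) != len(indirect_counts):
        raise ValueError("both series must cover the same time steps")
    return [d * (1.0 + alpha * i) for d, i in zip(direct_counts, indirect_counts)]


# Toy weekly counts (invented for illustration, not data from the study).
direct = [10, 40, 20, 8]    # tweets that directly report influenza
indirect = [0, 1, 3, 4]     # tweets that mention influenza indirectly
estimates = compensate_inhibition(direct, indirect)
print(estimates)
```

With these toy numbers, the later weeks have few direct tweets but many indirect ones, so their estimates are raised the most. This mirrors the scenario in which direct reports fall off even though the disease is still widely known.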

Results:

The estimation performance of our approach was evaluated using the correlation coefficient between the number of influenza cases (the gold standard values) and the values estimated by the proposed models. The baseline model (BASELINE+NLP) achieved a correlation of .36, whereas the proposed model (TRAP+NLP) achieved .70 (+.34 points).
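As a rough illustration of this kind of evaluation, the sketch below computes a correlation coefficient between a series of reported case counts and a series of model estimates. The abstract does not state which coefficient was used; Pearson's r is assumed here, and the function name `pearson_r` and the toy series are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Toy series (invented for illustration, not data from the study).
reported_cases = [1, 2, 3, 4, 5]     # stand-in for gold standard counts
model_estimates = [2, 4, 6, 8, 10]   # stand-in for model output
r = pearson_r(reported_cases, model_estimates)
print(round(r, 2))
```

A value near 1 means the estimates rise and fall with the reported cases, a value near 0 means they are unrelated, and a value near -1 means they move in opposite directions.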

Conclusions:

The proposed approach, in which indirect information inhibits direct information, improved estimation performance not only in rural cities but also in urban cities. This demonstrates the effectiveness of the proposed method, which consists of a TRAP model and natural language processing (NLP) classification.

Citation

Please cite as:

Wakamiya S, Kawai Y, Aramaki E

Twitter-Based Influenza Detection After Flu Peak via Tweets With Indirect Information: Text Mining Study

JMIR Public Health Surveill 2018;4(3):e65

DOI: 10.2196/publichealth.8627

PMID: 30274968

PMCID: 6231889

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Aug 2, 2017

Date Accepted: Jul 18, 2018

Per the author's request the PDF is not available.