Long Short-term Memory–Based Prediction of the Spread of Influenza-Like Illness Leveraging Surveillance, Weather, and Twitter Data: Model Development and Validation

The use of Twitter data enhances LSTM-generated prediction of the spread of influenza-like illness based on surveillance and weather reports

Maria Athanasiou;
Georgios Fragkozidis;
Konstantia Zarkogianni;
Konstantina S. Nikita

ABSTRACT

Background:

Harnessing the many data sources available in real time, together with advanced data analytics, to accurately predict influenza-like illness (ILI) outbreaks has attracted significant scientific interest. Methodologies based on machine learning techniques and on traditional and alternative data sources, such as ILI surveillance reports, weather reports, search engine queries, and social media, have been explored with the ultimate goal of developing electronic surveillance systems that could complement existing monitoring resources.

Objective:

The aim of the present study is to investigate, for the first time, the combined use of ILI surveillance data, weather data, and Twitter data with deep learning techniques to develop prediction models able to nowcast and forecast weekly ILI cases.

Methods:

The model’s input space consists of information related to weekly ILI surveillance, online social behavior (e.g., Twitter activity), and weather conditions. For the design and development of the model, relevant data covering the period 2010-2019 and focusing on the Greek population and weather were collected. Long short-term memory (LSTM) neural networks are leveraged to efficiently handle the sequential and nonlinear nature of the collected data. The three data categories are first used separately to train three LSTM-based primary models. Subsequently, different transfer learning (TL) approaches are explored, each creating a feature space that combines the features extracted from the corresponding primary models’ LSTM layers and feeds it to a dense layer.
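The fusion step described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the `TinyLSTM` class, the hidden size of 4, the three-week input window, the feature counts per source, and all numeric values are hypothetical, and the LSTM weights are random rather than trained. It only shows the data flow, in which each primary model encodes its own sequence, the resulting LSTM features are concatenated, and a dense layer maps the combined features to one weekly ILI estimate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM, forward pass only, with random (untrained) weights."""

    def __init__(self, n_inputs, n_hidden, seed):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        n_cols = n_inputs + n_hidden + 1  # inputs, previous hidden state, bias
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.weights = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)]
                   for _ in range(n_hidden)]
            for gate in "ifog"
        }

    def _gate(self, gate, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)))
                for row in self.weights[gate]]

    def encode(self, sequence):
        """Return the final hidden state for a sequence of feature vectors."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in sequence:
            z = list(x) + h + [1.0]
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("g", z, math.tanh)
            # Cell state: keep part of the old memory, add part of the candidate.
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def fused_prediction(encoders, sequences, dense_weights, dense_bias):
    """Concatenate the LSTM features of all sources and apply a dense layer."""
    features = []
    for name in ("ili", "weather", "twitter"):
        features.extend(encoders[name].encode(sequences[name]))
    output = sum(w * x for w, x in zip(dense_weights, features)) + dense_bias
    return output, features


# Hypothetical toy inputs: three weeks of scaled values per data source.
encoders = {
    "ili": TinyLSTM(n_inputs=1, n_hidden=4, seed=0),
    "weather": TinyLSTM(n_inputs=2, n_hidden=4, seed=1),
    "twitter": TinyLSTM(n_inputs=1, n_hidden=4, seed=2),
}
sequences = {
    "ili": [[0.10], [0.15], [0.22]],
    "weather": [[0.3, 0.7], [0.2, 0.8], [0.1, 0.9]],
    "twitter": [[0.05], [0.09], [0.12]],
}
prediction, features = fused_prediction(encoders, sequences, [0.1] * 12, 0.0)
print(len(features), prediction)
```

In a transfer learning setting, the encoder weights would come from the separately trained primary models, and only the dense layer (or the dense layer plus selected LSTM layers) would be trained on the combined feature space.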

Results:

The primary model that learns from weather data yields better forecast accuracy (root mean square error, RMSE = 0.144; Pearson correlation coefficient, PCC = 0.801) than the model trained on historical ILI data (RMSE = 0.159, PCC = 0.794). The best performance is achieved by the TL-based model leveraging the combination of the three data categories (RMSE = 0.128, PCC = 0.822).
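For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how RMSE and PCC are conventionally computed from observed and predicted weekly values. The function names and the toy numbers are illustrative assumptions, and the abstract does not state on which scale (raw or normalized) the reported values were calculated.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical toy values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(observed, predicted))     # 1.0
print(pearson(observed, predicted))  # 8 / sqrt(70), roughly 0.956
```

A lower RMSE indicates smaller prediction errors, while a PCC closer to 1 indicates that predictions track the week-to-week trend of the observed series more closely; the two can disagree, which is why both are reported.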

Conclusions:

The superiority of the TL-based model, which takes into account Twitter data, weather data, and ILI surveillance data, reflects the potential of alternative public sources to enhance accurate and reliable prediction of ILI spread.

Citation

Please cite as:

Athanasiou M, Fragkozidis G, Zarkogianni K, Nikita KS

Long Short-term Memory–Based Prediction of the Spread of Influenza-Like Illness Leveraging Surveillance, Weather, and Twitter Data: Model Development and Validation

J Med Internet Res 2023;25:e42519

DOI: 10.2196/42519

PMID: 36745490

PMCID: 9941907

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 6, 2022

Date Accepted: Nov 30, 2022
