Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Apr 20, 2020
Date Accepted: Jul 19, 2020
Date Submitted to PubMed: Jul 21, 2020

The final, peer-reviewed published version of this preprint can be found here:

Cousins H, Cousins C, Harris A, Pasquale L

Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns

J Med Internet Res 2020;22(7):e19483

DOI: 10.2196/19483

PMID: 32692691

PMCID: 7394521

Regional Infoveillance of COVID-19 Case Rates: Analysis of Search-Engine Query Patterns

  • Henry Cousins
  • Clara Cousins
  • Alon Harris
  • Louis Pasquale

ABSTRACT

Background:

Timely allocation of medical resources for COVID-19 requires early detection of regional outbreaks. Internet browsing data, such as search activity levels, may provide predictive ability for estimating cases in a local population that are yet to be confirmed.

Objective:

The objective of our study was to determine whether search-engine query patterns can forecast COVID-19 case rates at the state and local levels in the United States.

Methods:

We used regional confirmed case data from the New York Times and Google Trends results from 50 states and 203 county-based designated market areas (DMAs). We identified search terms whose activity precedes and correlates with confirmed case rates at the national level, using univariate regression to construct a composite explanatory variable based on top-scoring search queries offset by temporal lags. We then measured the correlation of the explanatory variable with out-of-sample case rate data at the state and DMA levels.
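The lag-selection and compositing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it scores each term by Pearson correlation as a stand-in for the univariate regression score, and the function names, the term names (`term_a`, `term_b`), the 14-day lag window, and the synthetic data are all assumptions made for illustration.

```python
from math import exp, sqrt
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def lagged_r(search, cases, lag):
    """Correlate search activity on day t with the case rate on day t + lag."""
    return pearson(search[:len(search) - lag], cases[lag:])


def best_lag(search, cases, max_lag=14):
    """Return the (lag, r) pair with the highest correlation for lags 0..max_lag."""
    scored = ((lag, lagged_r(search, cases, lag)) for lag in range(max_lag + 1))
    return max(scored, key=lambda pair: pair[1])


def zscore(x):
    """Standardize a series so terms with different scales can be averaged."""
    n = len(x)
    mean = sum(x) / n
    sd = sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]


def composite(term_series, cases, top_k=2, max_lag=14):
    """Average the standardized, lag-shifted series of the top_k terms."""
    ranked = sorted(
        ((name, *best_lag(series, cases, max_lag))
         for name, series in term_series.items()),
        key=lambda item: item[2],
        reverse=True,
    )[:top_k]
    start = max(lag for _, lag, _ in ranked)  # first day every term covers
    z = {name: zscore(term_series[name]) for name, _, _ in ranked}
    signal = [
        sum(z[name][t - lag] for name, lag, _ in ranked) / len(ranked)
        for t in range(start, len(cases))
    ]
    return ranked, start, signal


# Synthetic data (assumption): term_a leads the case curve by 5 days,
# term_b is unrelated noise.
full = [exp(-((day - 30) / 8) ** 2) for day in range(65)]
cases = full[:60]
rng = random.Random(0)
terms = {
    "term_a": full[5:65],
    "term_b": [rng.random() for _ in range(60)],
}
ranked, start, signal = composite(terms, cases, top_k=1)
```

In this sketch, `signal[i]` is the forecast for day `start + i`, so it can be compared directly with `cases[start:]` from a region that was not used to choose the terms and lags.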

Results:

Forecasts were highly correlated with confirmed case rates at the state and local levels, using search data available up to 10 days in advance of confirmed case rates. The forecasts predicted case activity in 49 of 50 states and in 128 of 203 DMAs at a significance level of .05 and were robust to differences in regional location, population, and date of outbreak.
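As a rough illustration of how regions might be counted as significant at the .05 level, the sketch below tests each regional correlation with the Fisher z-transformation and a normal approximation. The abstract does not state which test was used, so the choice of test, the function names, and the example values of r and n are assumptions.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for H0: no correlation, given r from n paired days.

    Uses the Fisher z approximation; requires |r| < 1 and n > 3.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_significant(region_results, alpha=0.05):
    """Count regions whose forecast correlates positively with cases at p < alpha.

    region_results maps a region name to (r, n).
    """
    return sum(
        1
        for r, n in region_results.values()
        if r > 0 and correlation_p_value(r, n) < alpha
    )


# Hypothetical regional results, for illustration only.
example = {
    "region_1": (0.90, 30),
    "region_2": (0.10, 30),
    "region_3": (0.50, 30),
}
significant = count_significant(example)
```

Daily case counts are autocorrelated, so a test that treats days as independent observations will tend to understate the p-value; this sketch ignores that effect for brevity.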

Conclusions:

Identifiable patterns in search query activity may be used to forecast emerging regional outbreaks of COVID-19.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.