Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Aug 13, 2023
Date Accepted: Oct 4, 2024
Google Trends data differently reflect demographic and clinical subgroups: An infodemiology case study on asthma hospitalizations
ABSTRACT
Background:
Google Trends (GT) data have shown promising results as a complementary tool to classical surveillance approaches. However, GT data are not necessarily provided by a representative sample of patients, and may be skewed towards demographic and clinical groups that are more likely to use the internet to search for health information.
Objective:
In this study, we aimed to assess whether GT-based models perform differently in distinct population subgroups. To this end, we analysed a case study on asthma hospitalizations.
Methods:
We analysed all hospitalizations with a main diagnosis of asthma occurring in three different countries (Portugal, Spain, and Brazil) for a period of approximately five years (January 1, 2012-December 17, 2016). Data on web-based searches on common cold for the same countries and time period were retrieved from GT. We estimated the correlation between GT data and the weekly occurrence of asthma hospitalizations (considering separate asthma admissions data according to patients’ age, sex, ethnicity and presence of comorbidities). In addition, we built autoregressive models to forecast the weekly number of asthma hospitalizations (for the different aforementioned subgroups) for a period of 1 year (June 2015-June 2016) based on admissions and GT data from the 3 previous years.
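As an illustration of the subgroup correlation step described above, the following is a minimal sketch in plain Python, not the authors' code. The weekly values, the subgroup names, and the helper functions (average_ranks, pearson, spearman) are made up for this example; the abstract does not state which software was used or how ties were handled, so the mean-rank tie rule here is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the rank-transformed series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical weekly data (illustrative only, not taken from the study).
gt_index = [12, 18, 25, 40, 33, 21, 15, 10]
admissions_by_subgroup = {
    "female": [5, 7, 9, 14, 12, 8, 6, 4],
    "male": [6, 5, 7, 9, 10, 8, 4, 5],
}

# One correlation coefficient per subgroup, against the same GT series.
rho_by_subgroup = {
    name: spearman(gt_index, series)
    for name, series in admissions_by_subgroup.items()
}
```

Computing one coefficient per subgroup against the same GT series is what allows subgroups to be compared. In practice a library routine such as scipy.stats.spearmanr would normally replace the hand-written helpers.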
Results:
Overall, correlation coefficients between GT on the pseudo-influenza syndrome topic and asthma hospitalizations ranged between 0.33 (in Portugal for admissions with at least one Charlson comorbidity group) and 0.86 (for admissions in females and in Whites in Brazil). In the three assessed countries, forecasted hospitalizations for 2015-2016 correlated more strongly with observed admissions of older versus younger individuals (Portugal: Spearman ρ=0.70 vs ρ=0.56; Spain: ρ=0.88 vs ρ=0.76; Brazil: ρ=0.83 vs ρ=0.82). In Portugal and Spain, forecasted hospitalizations displayed a stronger correlation with admissions occurring in female than in male individuals (Portugal: ρ=0.75 vs ρ=0.52; Spain: ρ=0.83 vs ρ=0.51). In Brazil, stronger correlations were observed for admissions of White compared with Black or Brown individuals (ρ=0.92 vs ρ=0.87). In Portugal, stronger correlations were observed for admissions of individuals without any comorbidity compared with admissions of individuals with comorbidities (ρ=0.68 vs ρ=0.66).
Conclusions:
We observed that models based on GT data may perform differently in demographic and clinical subgroups of participants, possibly reflecting differences in the composition of internet users' health-seeking behaviours.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.