Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Oct 14, 2021
Date Accepted: May 27, 2022

The final, peer-reviewed published version of this preprint can be found here:

Using Social Media to Predict Food Deserts in the United States: Infodemiology Study of Tweets

Sigalo N, St. Jean B, Frias-Martinez V

JMIR Public Health Surveill 2022;8(7):e34285

DOI: 10.2196/34285

PMID: 35788108

PMCID: 9297137

Using Social Media to Predict Food Deserts in the United States: Infodemiology Study of Tweets

  • Nekabari Sigalo; 
  • Beth St. Jean; 
  • Vanessa Frias-Martinez

ABSTRACT

Background:

Food insecurity is becoming increasingly important to public health practitioners because of the adverse health outcomes and underlying racial disparities associated with insufficient access to healthy foods. Prior research has used data sources such as surveys, geographic information systems, and food store assessments to identify regions classified as food deserts, but the individuals in these regions may unknowingly provide their own accounts of food consumption and food insecurity via social media. Social media data have proved useful in answering questions related to public health, so they may prove to be a rich data source for identifying food deserts in the United States.

Objective:

The aim of this study was to develop, from geotagged Twitter data, a predictive model for the identification of food deserts in the United States, using the linguistic constructs found in food-related tweets.

Methods:

Twitter’s streaming application programming interface was used to collect a random 1% sample of public, geolocated tweets across 25 major cities from March 2020 to December 2020. In total, 60,174 geolocated, food-related tweets were collected. Each geolocated tweet was mapped to its respective census tract using point-to-polygon mapping, which allowed us to develop census tract-level features derived from the linguistic constructs found in food-related tweets, such as tweet sentiment and the average nutritional value of foods mentioned in tweets. These features were then used to examine the associations between food desert status and the food-ingestion language and sentiment of tweets in a census tract, and to determine whether food-related tweets can be used to infer census tract-level food desert status.
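The point-to-polygon mapping and tract-level aggregation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which tools were used, and in practice a GIS library with real census tract boundary files would likely replace the hand-written ray-casting test. The tract polygons, tweet records, field names (`lon`, `lat`, `sentiment`, `healthy`), and function names below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_tract(lon, lat, tracts):
    """Return the ID of the first tract containing the point, else None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


def tract_features(tweets, tracts):
    """Group tweets by tract and compute simple tract-level features."""
    grouped = defaultdict(list)
    for tweet in tweets:
        tract_id = assign_tract(tweet["lon"], tweet["lat"], tracts)
        if tract_id is not None:  # drop tweets outside every tract
            grouped[tract_id].append(tweet)
    return {
        tract_id: {
            "n_tweets": len(items),
            "mean_sentiment": mean(t["sentiment"] for t in items),
            "share_healthy": mean(1.0 if t["healthy"] else 0.0 for t in items),
        }
        for tract_id, items in grouped.items()
    }


# Hypothetical toy data: two adjacent square "tracts" and four tweets.
TRACTS = {
    "tract_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "tract_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
TWEETS = [
    {"lon": 0.25, "lat": 0.5, "sentiment": 0.5, "healthy": True},
    {"lon": 0.75, "lat": 0.25, "sentiment": -0.5, "healthy": False},
    {"lon": 1.5, "lat": 0.5, "sentiment": 0.25, "healthy": False},
    {"lon": 5.0, "lat": 5.0, "sentiment": 0.9, "healthy": True},  # outside both
]
FEATURES = tract_features(TWEETS, TRACTS)
```

In this sketch each tract ends up with a small feature dictionary (tweet count, mean sentiment, share of tweets mentioning healthy foods), which is the kind of tract-level input a downstream classifier could use. Points that lie exactly on a shared tract boundary are a known edge case for ray casting and would need an explicit tie-breaking rule in real data.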

Results:

We found associations between a census tract being classified as a food desert and an increase in the number of tweets in that census tract that mentioned unhealthy foods (P=.03), including foods high in cholesterol (P=.02) or lower in key nutrients, such as potassium (P=.01). We also found an association between a census tract being classified as a food desert and an increase in the proportion of tweets that mentioned healthy foods (P=.03) and fast-food restaurants (P=.01) with positive sentiment. Finally, we found that including food-ingestion language derived from tweets in classification models that predict food desert status improves model performance compared with baseline models that include only socioeconomic characteristics.

Conclusions:

Social media data have been increasingly used to answer questions related to health and well-being. Using Twitter data, we found that food-related tweets can be used to develop models that predict census tract-level food desert status with high accuracy and that improve over baseline models. Food-ingestion language found in tweets, such as census tract-level measures of food sentiment and healthiness, is associated with census tract-level food desert status.


Per the author's request the PDF is not available.

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.