Currently submitted to: Journal of Medical Internet Research
Date Submitted: Mar 23, 2026
Open Peer Review Period: Mar 24, 2026 - May 19, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Using Google Places Data to Construct the US Built Environment Retail (UBER) Index and Its Spatial Association with Diabetes Prevalence
ABSTRACT
Background:
National surveillance of commercial retail environments is limited by data sources that are updated infrequently and capture narrow dimensions of food access. The Google Places API offers continuously updated, programmatically accessible data on business locations across the US, but its validity as a population-level exposure measure has not been systematically evaluated.
Objective:
To develop and evaluate a Google Places–derived US Built Environment Retail (UBER) Index as a scalable measure of county-level commercial retail infrastructure in the contiguous United States, and to estimate its spatial association with age-adjusted diabetes prevalence.
Methods:
We constructed the US Built Environment Retail (UBER) Index using Google Places API data capturing alcohol outlets, fast-food and convenience stores, grocery and health food stores, and fitness and recreation facilities across 1,701 US counties. Principal component analysis generated a single composite index. We assessed construct validity against the USDA Food Access Research Atlas and County Health Rankings benchmarks and estimated spatial associations with age-adjusted diabetes prevalence (CDC PLACES, 2025) using spatial error models adjusting for area deprivation, urbanicity, and census division. Quantile regression and tract-versus-county comparisons evaluated robustness.
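The composite-index step described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes the four indicators are z-scored before PCA, that the first component is rescaled to unit SD, and it runs on synthetic data for 200 hypothetical areas. The function name `composite_index` and all variable names are introduced here for illustration only.

```python
import numpy as np

def composite_index(indicators):
    """Return (index, loadings, explained share) from a PCA of standardized indicators.

    `indicators` is an (n_areas, n_indicators) array. Z-scoring the columns and
    rescaling the first component to unit SD are assumptions of this sketch.
    """
    X = np.asarray(indicators, dtype=float)
    # Standardize each indicator so that no single outlet type dominates.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via eigendecomposition of the correlation matrix.
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    w = eigvecs[:, -1]                    # loadings of the first component
    if w.sum() < 0:                       # eigenvector sign is arbitrary; orient it positively
        w = -w
    scores = Z @ w
    index = scores / scores.std(ddof=1)   # one unit = one SD of the index
    explained = eigvals[-1] / eigvals.sum()
    return index, w, explained

# Synthetic example: four indicators driven by one shared "density" factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = np.column_stack([latent + 0.3 * rng.normal(size=200) for _ in range(4)])
index, loadings, explained = composite_index(X)
print(np.round(loadings, 3), round(float(explained), 3))
```

When the indicators share one strong common factor, as in this synthetic setup, the loadings come out nearly equal and the first component explains most of the variance.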
Results:
The first principal component explained 92.9% of variance among the four indicators with near-equal loadings (range: 0.492–0.506), indicating the index captures overall commercial retail density rather than any single establishment type. Correlations with existing food-environment benchmarks were weak (|r| < 0.20), confirming that the index measures a distinct dimension of the built environment not reflected in current surveillance tools. Each SD increase in the UBER Index was associated with 0.24 percentage points higher diabetes prevalence (95% CI, 0.16–0.32; P < .001). Associations were stable across quantiles of diabetes prevalence. County- and tract-level analyses showed discordant effect directions, indicating scale-dependent ecological confounding.
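As a back-of-envelope reading of the loadings and variance share reported above (an illustration, not a calculation taken from the manuscript): if the four indicators are standardized and the PCA is run on their correlation matrix, which is an assumption here, then exactly equal loadings would each have to be 0.5, the value the reported range brackets. The first component is then approximately proportional to the simple mean of the four z-scores. The symbols w_j, z_j, and λ_1 are notation introduced for this sketch.

```latex
\sum_{j=1}^{4} w_j^{2} = 1, \quad w_1 = \dots = w_4 = w
\;\Longrightarrow\; 4w^{2} = 1 \;\Longrightarrow\; w = 0.5

\mathrm{PC}_1 = \sum_{j=1}^{4} w_j z_j \approx 0.5 \sum_{j=1}^{4} z_j = 2\,\bar{z}

\lambda_1 = 0.929 \times 4 = 3.716 \quad \text{(total variance of four standardized indicators is 4)}
```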
Conclusions:
Google Places API data can generate a reliable, spatially structured index of commercial retail infrastructure associated with county-level diabetes prevalence. The UBER Index captures a dimension of the built environment not represented in existing surveillance systems and may support scalable digital monitoring of commercial environments relevant to chronic disease prevention.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.