Accepted for/Published in: JMIR mHealth and uHealth
Date Submitted: Mar 16, 2021
Date Accepted: Aug 27, 2021
Understanding predictors of missing location data to inform smartphone study design: an observational study
ABSTRACT
Background:
Smartphone location data can be used for observational health studies (to determine participant exposure or behavior) or to deliver a location-based health intervention. However, missing location data are more common with smartphones than with research-grade location trackers. Missing location data can affect study validity and intervention safety.
Objective:
The objective of the study was to investigate the distribution of missing location data and its predictors to inform design, analysis and interpretation of future smartphone studies.
Methods:
We analyzed hourly smartphone location data collected from 9,665 research participants on 448,400 participant–days in a national smartphone study investigating the association between weather and pain (Cloudy with a Chance of Pain). We used a mixed-effects logistic regression model (a generalized linear mixed-effects model) to identify whether a successfully recorded geolocation was associated with time of day, time in study, operating system, time since previous survey completion, participant age, sex and weather sensitivity.
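As an illustration of the hourly data structure described above, the following is a minimal sketch of how recorded locations could be counted per participant-day (out of 24 hourly slots) and summarized as a median per participant. The function names, the record layout `(participant_id, day, hour)` and the toy records are assumptions made for this example; they are not taken from the study's code or data.

```python
from collections import defaultdict
from statistics import median

HOURS_PER_DAY = 24  # the study sampled location hourly, so at most 24 per day


def locations_per_day(records):
    """Count distinct hours with a recorded location per (participant, day).

    `records` is an iterable of (participant_id, day, hour) tuples, one per
    successfully recorded location (hypothetical layout for this sketch).
    """
    hours_seen = defaultdict(set)
    for participant_id, day, hour in records:
        if not 0 <= hour < HOURS_PER_DAY:
            raise ValueError(f"hour out of range: {hour}")
        hours_seen[(participant_id, day)].add(hour)
    return {key: len(hours) for key, hours in hours_seen.items()}


def median_per_participant(counts, days_in_study):
    """Median locations per day for each participant.

    `days_in_study` maps participant_id to the days that participant was
    enrolled; days without any recorded location count as 0.
    """
    return {
        participant_id: median(counts.get((participant_id, d), 0) for d in days)
        for participant_id, days in days_in_study.items()
    }


# Toy records, invented for illustration only.
toy_records = (
    [("a", 1, h) for h in range(24)]
    + [("a", 2, h) for h in range(24)]
    + [("b", 1, 8), ("b", 1, 20), ("b", 2, 9), ("b", 2, 21)]
)
toy_days = {"a": [1, 2], "b": [1, 2], "c": [1, 2]}

toy_counts = locations_per_day(toy_records)
toy_medians = median_per_participant(toy_counts, toy_days)
```

In this toy example, participant "a" has complete data on both days, "b" has two locations per day, and "c" has none, mirroring the three patterns reported in the Results.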
Results:
The app most commonly collected a median of 2 out of a maximum of 24 locations per day (18% of participants), no location data (0/24; 17%), or complete data (24/24; 16%). The median locations per day differed by operating system: participants with an Android phone most often had complete data (a median of 24/24 locations), whereas iPhone users most often had a median of 2/24 locations. The odds of a successfully recorded location were 22.91 times higher for Android phones than for iPhones (95% CI 19.53–26.87). The odds of a successfully recorded location were lower during weekends (OR 0.94; 95% CI 0.94–0.95) and nights (OR 0.37; 95% CI 0.37–0.38), if time in study was longer (OR 0.99 per additional day in study; 95% CI 0.99–1.00), and if a participant had not used the app recently (OR 0.96 per additional day since last survey entry; 95% CI 0.96–0.96). Participant age and sex did not predict missing location data.
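To help interpret the odds ratios reported above, the following is a minimal sketch of how an odds ratio changes a probability and how a per-day odds ratio compounds over several days. The odds ratios are those reported in the Results; the baseline probability of 0.50 is a hypothetical value chosen for illustration and is not an estimate from the study.

```python
def apply_odds_ratio(baseline_prob, odds_ratio, units=1.0):
    """Return the probability after scaling the baseline odds.

    The baseline odds are multiplied by odds_ratio ** units, which is how a
    per-unit odds ratio from a logistic model compounds over several units.
    """
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)


# Odds ratios as reported in the Results.
OR_NIGHT = 0.37
OR_PER_DAY_SINCE_SURVEY = 0.96

# Hypothetical baseline probability of recording a location (assumption).
BASELINE_PROB = 0.50

# Probability at night, all else equal.
p_night = apply_odds_ratio(BASELINE_PROB, OR_NIGHT)

# Probability after 7 days without a survey entry, all else equal.
p_week_inactive = apply_odds_ratio(BASELINE_PROB, OR_PER_DAY_SINCE_SURVEY, units=7)
```

Under this assumed baseline, the probability falls to roughly 0.27 at night and to roughly 0.43 after a week without app use, which shows that an odds ratio close to 1 per day can still accumulate into a noticeable difference.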
Conclusions:
These predictors of missing location data could inform app settings and user instructions for future smartphone studies. They have implications for analysis methods to deal with missing location data, such as imputation of missing values or case-only analysis. Health studies using smartphones for data collection should assess context-specific consequences of high levels of missing data, especially among iPhone users, during the night and for disengaged participants.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.