JMIR Preprints #37887: If You’re Happy and You Know It, Answer This Question: A simulation study of the impact of non-random missingness in surveillance data on population-level summaries

Paul Samuel Weiss;
Lance Allyn Waller

ABSTRACT

Background:

Surveillance data are an essential public health resource for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on non-random sample designs. Population estimates based on such data may be biased when the distribution of the sample differs from that of the true population of interest. Here we simulate a population of interest and allow response rates to vary in non-random ways to illustrate and measure the effect this has on population-based estimates of an important public health policy outcome.

Objective:

To explore the effects of non-random missing data on surveillance-based population estimates.

Methods:

We simulate a population of respondents answering a survey question about their satisfaction with their community’s policy regarding vaccination mandates for government personnel. We allow response rates to differ between the generally satisfied and dissatisfied and consider the effect of common efforts to control for potential bias: sampling weights, sample size inflation and hypothesis tests for determining missingness at random. We compare these conditions via mean squared errors and sampling variability to characterize the bias in estimation arising under these different approaches.
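The design described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the true satisfaction proportion (0.5), the response rates (0.6 and 0.3), and the sample sizes are all hypothetical values chosen only for illustration. The sketch draws repeated samples in which satisfied and dissatisfied people respond at different rates, estimates the satisfied proportion from respondents only, and summarizes bias, sampling variance, and mean squared error.

```python
import random
import statistics


def simulate_estimates(true_prop, resp_satisfied, resp_dissatisfied,
                       n_invited, n_reps=200, seed=0):
    """Return respondent-only estimates of the satisfied proportion.

    All arguments are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        responders = 0
        satisfied_responders = 0
        for _ in range(n_invited):
            satisfied = rng.random() < true_prop
            # Response probability depends on the (unobserved) outcome itself,
            # which is what makes the missingness non-random.
            rate = resp_satisfied if satisfied else resp_dissatisfied
            if rng.random() < rate:
                responders += 1
                satisfied_responders += satisfied
        if responders:
            estimates.append(satisfied_responders / responders)
    return estimates


def summarize(estimates, true_prop):
    """Bias, sampling variance, and MSE of the estimates around true_prop."""
    bias = statistics.fmean(estimates) - true_prop
    variance = statistics.pvariance(estimates)
    # MSE decomposes exactly into variance plus squared bias.
    return {"bias": bias, "variance": variance, "mse": variance + bias ** 2}


# Hypothetical scenario: half the population is satisfied, and the satisfied
# respond twice as often as the dissatisfied.
TRUE_PROP = 0.5
small = summarize(simulate_estimates(TRUE_PROP, 0.6, 0.3, n_invited=500), TRUE_PROP)
large = summarize(simulate_estimates(TRUE_PROP, 0.6, 0.3, n_invited=2000), TRUE_PROP)

print("n_invited=500 :", small)
print("n_invited=2000:", large)
```

Comparing the two summaries shows which component of the mean squared error responds to a larger sample: the sampling variance shrinks, while the bias term is driven by the response rates rather than by the number of people invited.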

Results:

Sample estimates present clear, quantifiable bias, even in the most favorable response profile. Efforts to mitigate bias through sample size inflation and sampling weights have negligible effect on the overall result. Additionally, hypothesis testing for departures from random missingness rarely detects the non-random missingness across the widest range of response profiles considered.
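A short calculation suggests why a larger sample alone leaves this bias in place. The notation here is an assumption introduced for illustration and does not come from the paper: let $p$ be the true proportion satisfied, and let $r_S$ and $r_D$ be the response probabilities of satisfied and dissatisfied people. For large samples, the proportion satisfied among respondents is approximately

```latex
\[
E[\hat{p}] \approx \frac{p\, r_S}{p\, r_S + (1 - p)\, r_D},
\qquad
E[\hat{p}] - p \approx \frac{p\,(1 - p)\,(r_S - r_D)}{p\, r_S + (1 - p)\, r_D}.
\]
```

The sample size does not appear in the bias term, which vanishes only when $r_S = r_D$ (or when $p$ is 0 or 1). For example, with the hypothetical values $p = 0.5$, $r_S = 0.6$, and $r_D = 0.3$, the respondent-based estimate converges to $0.3 / 0.45 \approx 0.67$ however many people are invited.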

Conclusions:

Our results suggest that assuming surveillance data are missing at random during analysis could provide estimates that are widely different from what we might see in the whole population. Policy decisions based on such potentially biased estimates could have devastating consequences in terms of community disengagement and health disparities. Alternative approaches to analysis that move away from broad generalization of a mismeasured population at risk are necessary to identify the marginalized groups whose overall responses may be very different from those observed in the measured respondents.

Citation

Please cite as:

Weiss PS, Waller LA

The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study

JMIR Public Health Surveill 2022;8(9):e37887

DOI: 10.2196/37887

PMID: 36083618

PMCID: 9508670

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Mar 10, 2022

Date Accepted: Aug 5, 2022
