JMIR Preprints #85161: What’s a survey researcher to do? Applying an epidemiological approach to the detection of fraudulent survey responses

What’s a survey researcher to do? Applying an epidemiological approach to the detection of fraudulent survey responses

Rachel Willard-Grace;
Tali Klima;
Mansi Dedhia;
Emily Lo;
Annie Nisnevich;
Allison Gray;
Holly Henry

ABSTRACT

Background:

Survey research has the potential to elevate the experiences and opinions of marginalized populations. The rising number of bot attacks, a method of participant fraud that creates multiple records in survey data using automated software, threatens to drown out those voices and produce inaccurate findings. Rapid identification and mitigation of bot attacks is vital, but there is limited guidance for researchers on scalable approaches to address this problem.

Objective:

Using an epidemiological approach to evaluating diagnostic tests, we assessed how well recommended methods detected fraud, with the aim of giving other web-based survey researchers insight into how best to identify and shut down bot attacks.

Methods:

We analyzed data from a cross-sectional, web-based statewide survey on access to pediatric subspecialty care in California that used Qualtrics survey software. Caregivers of children with chronic conditions were recruited through Family Resource Centers (FRCs), nonprofit agencies serving families with developmental delays and chronic medical conditions. The survey was sent to 17 FRCs, whose staff distributed anonymous links to their clients through listservs and flyers. Respondents who completed the survey received a $30 gift card. Prior to launch, we designed a protocol to identify and respond to bot attacks, and we reviewed responses for markers of fraudulent activity. If markers were identified or there was a spike in responses, a senior member of our research team reviewed patterns among all submitted surveys for each FRC to look for signs of bot attacks. We calculated epidemiologic measures of diagnostic test accuracy, such as sensitivity, specificity, positive predictive value, and negative predictive value, to better understand the utility of recommended strategies to identify bot attacks.
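For readers less familiar with the diagnostic accuracy measures named above, the following is a minimal sketch of how they could be computed for a single fraud marker, treating the research team's final classification of each record as the reference standard. The class name, function names, and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing one fraud marker against the reference classification."""
    true_pos: int   # fraudulent records that the marker flagged
    false_pos: int  # valid records that the marker flagged
    false_neg: int  # fraudulent records that the marker missed
    true_neg: int   # valid records that the marker did not flag


def _ratio(numerator, denominator):
    # Return None rather than raising when a measure is undefined.
    return numerator / denominator if denominator else None


def diagnostic_accuracy(c):
    """Return sensitivity, specificity, PPV, and NPV for one marker."""
    return {
        "sensitivity": _ratio(c.true_pos, c.true_pos + c.false_neg),
        "specificity": _ratio(c.true_neg, c.true_neg + c.false_pos),
        "ppv": _ratio(c.true_pos, c.true_pos + c.false_pos),
        "npv": _ratio(c.true_neg, c.true_neg + c.false_neg),
    }


# Hypothetical counts for illustration only; these are not study results.
example = ConfusionCounts(true_pos=80, false_pos=10, false_neg=20, true_neg=90)
metrics = diagnostic_accuracy(example)
```

In this framing, a marker with high specificity but low sensitivity rarely flags valid records yet misses many fraudulent ones, so it may be more useful in combination with other markers than on its own.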

Results:

We received 646 valid survey records and 905 fraudulent records resulting from bot attacks. The primary indicator of a bot attack was a sudden spike in responses to the survey. Differences in demographics and outcomes, including wait times for pediatric subspecialty care and use of health care services, between the valid and fraudulent data indicated that failure to remove fraudulent records would have dramatically altered the survey results. Most methods recommended in the literature for identifying fraudulent responses had low sensitivity for detecting bot attacks, and only two performed better than chance alone at correctly identifying them. Combinations of fraud markers and blocks of repeated responses were particularly useful for identifying bot attacks.
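Because a sudden spike in responses was the primary indicator of an attack, a simple automated screen could prompt closer manual review. The sketch below is one possible approach, not the protocol used in this study: it flags any day whose submission count is well above the median daily count. The function name, the `multiplier` and `min_count` thresholds, and the example dates are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from statistics import median


def flag_spike_days(submission_dates, multiplier=3.0, min_count=10):
    """Return the sorted days whose submission count looks like a spike.

    A day is flagged when its count is at least `min_count` and exceeds
    `multiplier` times the median count across days with any submissions.
    Both thresholds are illustrative assumptions, not validated cutoffs.
    """
    counts = Counter(submission_dates)
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(
        day
        for day, n in counts.items()
        if n >= min_count and n > multiplier * baseline
    )


# Hypothetical submission dates: a quiet week with one burst of 40 responses.
example_dates = (
    [date(2025, 1, 1)] * 2
    + [date(2025, 1, 2)] * 3
    + [date(2025, 1, 3)] * 2
    + [date(2025, 1, 4)] * 40
    + [date(2025, 1, 5)] * 3
)
spike_days = flag_spike_days(example_dates)
```

A flag from a screen like this would only be a trigger for human review of the records submitted that day, since a legitimate recruitment push can also produce a burst of responses.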

Conclusions:

Fraudulent data entry using bots has been increasing in survey research. Sharing flexible protocols to identify and mitigate bot attacks in a way that is responsive to their ever-changing nature is vital to ensuring that researchers elevate the voices of real people within survey research to inform policy and programmatic discussions.

Citation

Please cite as:

Willard-Grace R, Klima T, Dedhia M, Lo E, Nisnevich A, Gray A, Henry H

Using Epidemiological Test Diagnostics to Select Fraud Detection Methods: Secondary Analysis of Quantitative Cross-Sectional Survey Data

J Med Internet Res 2026;28:e85161

DOI: 10.2196/85161

PMID: 41813240

PMCID: 12978920

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 2, 2025

Open Peer Review Period: Oct 2, 2025 - Nov 27, 2025

Date Accepted: Jan 30, 2026
