Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: May 5, 2021
Date Accepted: Mar 7, 2022

The final, peer-reviewed published version of this preprint can be found here:

Sehgal NJ, Huang S, Johnson NM, Dickerson J, Jackson D, Baur C

The Benefits of Crowdsourcing to Seed and Align an Algorithm in an mHealth Intervention for African American and Hispanic Adults: Survey Study

J Med Internet Res 2022;24(6):e30216

DOI: 10.2196/30216

PMID: 35727616

PMCID: 9257620

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

The Benefits of Crowdsourcing to Seed and Align an Algorithm in an mHealth Intervention for African American and Hispanic Adults

  • Neil Jay Sehgal
  • Shuo Huang
  • Neil Mason Johnson
  • John Dickerson
  • Devlon Jackson
  • Cynthia Baur

ABSTRACT

Background:

The lack of publicly available, culturally relevant data sets on African American and bilingual/Spanish-speaking Hispanic adults’ disease prevention and health promotion priorities presents a major challenge to researchers and developers who want to create and test personalized tools for preventive health behavior interventions. Personalization depends on prediction and performance data. To develop a ‘recommender system’ (RecSys) that predicts the most culturally and personally relevant preventive health information and serves it to African American and Hispanic users of a novel smartphone application while avoiding the ‘cold start’ problem, we needed population-appropriate seed data aligned with the app’s purposes of setting health goals and finding associated articles and topics in healthfinder.gov, a federally supported database of health conditions and disease prevention information.

Objective:

To address the lack of culturally specific preventive personal health data and sidestep the type of algorithmic bias inherent in a RecSys not trained on the target population, we created a novel data set on prevention-focused health goals by collecting a large amount of data quickly and at low cost from members of the target population. We seeded our RecSys with data collected anonymously from self-identified Hispanic and self-identified non-Hispanic African American adult respondents using Amazon Mechanical Turk.

Methods:

We developed an online survey in which respondents completed a personal profile, health literacy assessment, family health history, and personal health history. Respondents then selected their top three health goals related to preventable health conditions and, for each goal, reviewed and rated the top three healthfinder.gov results on four measures: importance, personal utility, whether the item should be added to their personal health library, and satisfaction with the quality of the information returned.
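As a rough illustration of the survey structure described above, one respondent's answers could be modeled as three goals, each with three rated articles. This is only a sketch: the class names, field names, and the 1-5 rating scales are assumptions made for illustration and are not taken from the study.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ArticleRating:
    """One respondent's ratings of a single healthfinder.gov article."""
    article_id: str        # hypothetical identifier
    importance: int        # assumed 1-5 scale
    personal_utility: int  # assumed 1-5 scale
    add_to_library: bool   # should the item go into the personal health library?
    satisfaction: int      # assumed 1-5 scale


@dataclass
class GoalResponse:
    """A selected health goal and the articles rated for it."""
    goal: str
    ratings: List[ArticleRating] = field(default_factory=list)


@dataclass
class SurveyResponse:
    """All goal-related answers from one anonymous respondent."""
    respondent_id: str
    goals: List[GoalResponse] = field(default_factory=list)

    def is_complete(self) -> bool:
        """True when there are exactly 3 goals, each with exactly 3 rated articles."""
        return len(self.goals) == 3 and all(len(g.ratings) == 3 for g in self.goals)


def flatten(response: SurveyResponse) -> List[Tuple[str, str, str, int]]:
    """Turn one response into (respondent, goal, article, importance) rows."""
    return [
        (response.respondent_id, g.goal, r.article_id, r.importance)
        for g in response.goals
        for r in g.ratings
    ]
```

Under these assumptions, a complete response flattens to nine rows (three goals times three articles), which is the shape a rating-based recommender would typically consume as seed data.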

Results:

We collected data from 985 self-identified Hispanic (49%) and self-identified non-Hispanic African American (51%) adult respondents using Amazon Mechanical Turk in only 64 days, at a cost of $6.74 per respondent. Respondents rated 92 unique articles. In both groups, the most frequent personal health goals were physical fitness (62.9%), healthy eating (43.2%), and nutrition and weight (24.0%). The most frequently experienced health conditions were mental health issues (34.6%), hypertension (31.0%), and vision or hearing impairments (24.4%), and the most frequent family health conditions were hypertension (55.0%), diabetes (46.1%), and obesity (39.6%). There were, however, statistically significant differences between the groups in the prevalences of goals, personal health conditions, and family health conditions. Although both groups reported experiencing mental health issues more frequently than any other condition, neither identified mental health as a high-priority personal health goal. Respondents’ personal goals aligned with the potentially preventable conditions they reported in their family health history.

Conclusions:

Researchers have quick, low-cost options, such as Amazon Mechanical Turk, to avoid the ‘cold start’ problem for algorithms and to sidestep bias and low relevance for an intended population of app users. Seeding a RecSys with responses from people like the intended users allows the development of a digital health tool that can recommend information to users based on similar demography, health goals, and health history. This approach minimizes potential initial gaps in algorithm performance, allows quicker algorithm refinement in use, and may deliver a better user experience to individuals seeking preventive health information to improve health and achieve health goals. Additionally, this approach allowed us to investigate the correlation between personal health goals and known health history in a sample of African American and Hispanic participants. Health goals for African American and Hispanic adults are more likely to reflect self-reported somatic health conditions, and less likely to reflect psychological health conditions, even when respondents report experiencing mental health issues.
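The idea of seeding a RecSys so that a brand-new user can receive recommendations from similar respondents can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes profiles are sets of feature tags, uses Jaccard similarity and a similarity-weighted average rating, and all article identifiers, tags, and ratings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# A seed record pairs a respondent's profile tags with their article ratings
# (ratings on an assumed 1-5 scale; all identifiers below are hypothetical).
SeedRecord = Tuple[Set[str], Dict[str, float]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of tags two profiles have in common (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(new_profile: Set[str], seed: List[SeedRecord], top_n: int = 3) -> List[str]:
    """Rank articles by the similarity-weighted mean rating of seed respondents."""
    weighted_sum: Dict[str, float] = defaultdict(float)
    weight_total: Dict[str, float] = defaultdict(float)
    for profile, ratings in seed:
        sim = jaccard(new_profile, profile)
        if sim == 0.0:
            continue  # respondents with nothing in common contribute nothing
        for article, rating in ratings.items():
            weighted_sum[article] += sim * rating
            weight_total[article] += sim
    scores = {a: weighted_sum[a] / weight_total[a] for a in weighted_sum}
    # Highest score first; article id breaks ties deterministically.
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_n]


# Hypothetical seed data standing in for crowdsourced survey responses.
seed: List[SeedRecord] = [
    ({"goal:physical_fitness", "family:hypertension"},
     {"get-active": 5, "lower-blood-pressure": 3}),
    ({"goal:healthy_eating", "family:diabetes"},
     {"eat-healthy": 4, "get-active": 2}),
    ({"goal:physical_fitness", "family:diabetes"},
     {"get-active": 4, "prevent-diabetes": 5}),
]

# A new app user with no rating history of their own.
new_user = {"goal:physical_fitness", "family:diabetes"}
print(recommend(new_user, seed))
```

Because the new user has no ratings, every score comes from the seed respondents. A profile that shares no tags with any seed record yields an empty list, so a real system would need a fallback such as overall popularity.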


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.