Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Nov 18, 2024
Open Peer Review Period: Nov 18, 2024 - Jan 13, 2025
Date Accepted: Jun 6, 2025

The final, peer-reviewed published version of this preprint can be found here:

Machine Learning Applications in Population and Public Health: Guidelines for Development, Testing, and Implementation

Pinto AD, Birdi S, Durant S, Rabet R, Parekh R, Ali S, Buckeridge D, Ghassemi M, Gibson J, John-Baptiste A, Macklin J, McCradden MD, McKenzie K, Naraei P, Owusu-Bempah A, Rosella LC, Shaw J, Upshur R, Mishra S

Machine Learning Applications in Population and Public Health: Guidelines for Development, Testing, and Implementation

JMIR Public Health Surveill 2025;11:e68952

DOI: 10.2196/68952

PMID: 41134979

PMCID: 12551935

Machine learning applications in population and public health: Guidelines for development, testing and implementation

  • Andrew D. Pinto
  • Sharon Birdi
  • Steve Durant
  • Roxana Rabet
  • Rahul Parekh
  • Shehzad Ali
  • David Buckeridge
  • Marzyeh Ghassemi
  • Jennifer Gibson
  • Ava John-Baptiste
  • Jillian Macklin
  • Melissa D. McCradden
  • Kwame McKenzie
  • Parisa Naraei
  • Akwasi Owusu-Bempah
  • Laura C. Rosella
  • James Shaw
  • Ross Upshur
  • Sharmistha Mishra

ABSTRACT

Background:

Machine learning (ML), a subset of artificial intelligence, uses large datasets to identify patterns between potential predictors and outcomes. ML involves iterative learning from data and is increasingly used in population and public health for early warning of infectious diseases, predicting non-communicable diseases, and assessing public health interventions. In addition to predictive modeling, ML is also utilized for clustering and causal inference, offering broader applications for analyzing public health data. However, ML applications can inadvertently amplify biases related to the social determinants of health.

Objective:

Specific guidelines for using ML in population and public health have not yet been created. This study aimed to develop recommendations for the ethical and effective application of ML in this field.

Methods:

A diverse team of experts in computer science, statistical modeling, clinical and population health epidemiology, health economics, ethics, sociology, and public health was assembled. Using a combination of comprehensive literature reviews and a modified Delphi process, the team identified and refined key recommendations.

Results:

Five key recommendations were developed: (1) prioritize partnerships and interventions to support structurally disadvantaged communities; (2) use ML for dynamic situations like public health emergencies while adhering to ethical standards; (3) conduct risk assessments and apply bias mitigation strategies aligned with the identified risks; (4) ensure technical transparency and reproducibility by publicly sharing data sources and methodologies; and (5) foster multidisciplinary dialogue to discuss potential harms of ML-related bias and raise awareness among the public and public health experts.

Conclusions:

The proposed guidelines provide clear, operational steps for stakeholders, ensuring that ML tools are not only effective but also ethically grounded and feasible in real-world scenarios.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.