Accepted for/Published in: JMIR Formative Research

Date Submitted: May 15, 2023
Open Peer Review Period: May 15, 2023 - Jul 10, 2023
Date Accepted: Dec 29, 2023

The final, peer-reviewed published version of this preprint can be found here:

Ma S, Jiang S, Yang O, Zhang X, Fu Y, Zhang Y, Kaareen A, Ling M, Chen J, Shang C

Use of Machine Learning Tools in Evidence Synthesis of Tobacco Use Among Sexual and Gender Diverse Populations: Algorithm Development and Validation

JMIR Form Res 2024;8:e49031

DOI: 10.2196/49031

PMID: 38265858

PMCID: 10851114

Use of Machine Learning Tools in Evidence Synthesis of Tobacco Use among Sexual and Gender Diverse Populations

  • Shaoying Ma
  • Shuning Jiang
  • Olivia Yang
  • Xuanzhi Zhang
  • Yu Fu
  • Yusen Zhang
  • Aadeeba Kaareen
  • Meng Ling
  • Jian Chen
  • Ce Shang

ABSTRACT

Background:

LGBTQ+ youth and adults in the US use tobacco at a higher rate than the national average. Published evidence urgently needs to be synthesized to inform tobacco control policies that better serve this priority population.

Objective:

To develop algorithms that curate peer-reviewed articles on LGBTQ+ individuals’ tobacco use published in leading tobacco research journals from 2015 to early 2021, and that extract domain-specific textual entities from these articles.

Methods:

Our team built a semantic database specific to the tobacco research domain to identify articles that studied the LGBTQ+ population and to extract data from them. We trained a language model to learn patterns and relationships between words and their context in text, then used it to extract named entities.
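As a rough illustration of the screening step only, the sketch below flags an abstract as relevant when it mentions both a population term and a tobacco term. It is not the authors' semantic database: the term lists, `compile_terms`, `is_relevant`, and `screen` are hypothetical names and values chosen for this example.

```python
import re

# Hypothetical term lists for illustration only; the study's actual
# semantic database is not reproduced here.
POPULATION_TERMS = ["lgbtq", "lgbt", "sexual minority", "gender minority",
                    "lesbian", "gay", "bisexual", "transgender"]
TOBACCO_TERMS = ["tobacco", "smoking", "cigarette", "e-cigarette",
                 "vaping", "nicotine"]


def compile_terms(terms):
    """Build one case-insensitive pattern matching any term as a whole word."""
    # Longest terms first so multi-word phrases are tried before shorter ones.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    # The optional "s" lets a singular term also match a simple plural.
    return re.compile(r"\b(?:" + alternatives + r")s?\b", re.IGNORECASE)


POPULATION_RE = compile_terms(POPULATION_TERMS)
TOBACCO_RE = compile_terms(TOBACCO_TERMS)


def is_relevant(abstract):
    """An abstract is kept only if it mentions both topics."""
    return bool(POPULATION_RE.search(abstract) and TOBACCO_RE.search(abstract))


def screen(abstracts):
    """Return the abstracts that pass the two-topic filter, in input order."""
    return [a for a in abstracts if is_relevant(a)]
```

Requiring both term families keeps the filter narrow; a real pipeline would likely need a much larger vocabulary and manual review of borderline abstracts.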

Results:

Of 2,993 paper abstracts, 33 were identified as relevant to LGBTQ+ individuals’ tobacco use. We extracted the following information: the groups studied within the LGBTQ+ population; geographic locations; product types and characteristics; analytical methods; behavioral outcomes; and policies or interventions.
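To make the six extracted categories concrete, here is a minimal sketch that groups matched phrases by category using a plain dictionary lookup. This is a simplified stand-in, not the trained language model the study describes; `CATEGORY_TERMS`, its entries, and `extract_entities` are assumptions made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary, one list per category named in the Results.
CATEGORY_TERMS = {
    "population_group": ["bisexual women", "transgender adults", "gay men"],
    "location": ["United States", "California"],
    "product": ["e-cigarettes", "cigarettes", "menthol"],
    "method": ["logistic regression", "survey"],
    "outcome": ["cessation", "initiation", "current use"],
    "policy_or_intervention": ["flavor ban", "tax", "media campaign"],
}


def extract_entities(text):
    """Return {category: [terms found in text]} using whole-phrase matching."""
    found = defaultdict(list)
    for category, terms in CATEGORY_TERMS.items():
        for term in terms:
            # Reject matches glued to a letter, digit, or hyphen, so that
            # "cigarettes" is not reported inside "e-cigarettes".
            pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
            if re.search(pattern, text, re.IGNORECASE):
                found[category].append(term)
    return dict(found)
```

A lookup like this only finds phrases listed in advance, which is the main limitation a trained entity extractor is meant to overcome.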

Conclusions:

Evidence on the impacts of tobacco control policies on the LGBTQ+ population was lacking among the articles from leading tobacco research journals. Our tools could be scaled up and applied to the broader LGBTQ+ health literature.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.