Accepted for/Published in: JMIR Infodemiology

Date Submitted: Aug 26, 2022
Open Peer Review Period: Aug 26, 2022 - Oct 21, 2022
Date Accepted: Feb 6, 2023

The final, peer-reviewed published version of this preprint can be found here:

Influence of User Profile Attributes on e-Cigarette–Related Searches on YouTube: Machine Learning Clustering and Classification

Murthy D, Lee J, Dashtian H, Kong G

JMIR Infodemiology 2023;3:e42218

DOI: 10.2196/42218

PMID: 37124246

PMCID: 10139687

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Effects of User Profile Attributes on E-cigarette-Related Searches on YouTube: Machine Learning Clustering and Classification

  • Dhiraj Murthy
  • Juhan Lee
  • Hassan Dashtian
  • Grace Kong

ABSTRACT

Background:

The proliferation of e-cigarette content on YouTube is concerning because of its possible effect on youth use behaviors. YouTube has a personalized search and recommendation algorithm that derives attributes, such as age and gender, from a user’s profile. However, little is known about whether e-cigarette content is shown differently based on user characteristics.

Objective:

To understand how the age and gender of user profiles affect e-cigarette-related YouTube search results.

Methods:

We created 16 fictitious YouTube profiles that varied by age (16 and 24 years old), gender (female and male), and ethnicity/race, and used them to search for 18 e-cigarette-related search terms. We used unsupervised and supervised machine learning (k-means clustering and classification with Graph Convolutional Networks [GCN]) together with network theory to characterize the variation in the search results of each profile. We further examined whether user attributes may play a role in exposure to e-cigarette-related content using networks and degree centrality.
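As a rough illustration of the degree centrality measure mentioned above, the following minimal Python sketch builds an undirected network from (profile, video) pairs and computes each node's normalized degree, that is, its number of neighbors divided by the number of other nodes. The abstract does not describe how the authors constructed their networks, so the profile-video structure, the node names, and the toy edges here are all hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for each node of an undirected graph.

    Centrality of a node = (number of distinct neighbors) / (n - 1),
    where n is the total number of nodes in the graph.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy data (not from the study): an edge means that a video
# appeared in the search results returned to a profile.
edges = [
    ("profile_f16", "video_a"),
    ("profile_f16", "video_b"),
    ("profile_m16", "video_a"),
    ("profile_f24", "video_a"),
    ("profile_f24", "video_c"),
    ("profile_m24", "video_c"),
]

centrality = degree_centrality(edges)
for node, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In a network like this, a video with high degree centrality is one that was shown to many profiles, while a profile with high degree centrality is one that was shown many distinct videos.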

Results:

We included 4,201 non-duplicate videos. Our k-means clustering suggested that the videos could be grouped into 3 clusters. The Graph Convolutional Network achieved high accuracy (0.72). Videos were classified by content into 4 categories: Product Review (49.3%), Health Info (15.1%), Instruction (26.9%), and Other (8.5%). Underage profiles were most exposed to “instruction” videos (37.5%), with some indication that female 16-year-old profiles were more exposed to this content, whereas older profiles (24 years old) were most exposed to “product review” videos (39.2%).
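For readers unfamiliar with k-means, the sketch below shows the basic assign-and-update loop (Lloyd's algorithm) with k = 3 in plain Python. It is only an illustration under stated assumptions: the abstract does not say which video features were clustered, so the two-dimensional points are hypothetical, and the farthest-point initialization is a simple deterministic stand-in for the random or k-means++ initialization that libraries typically use.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point farthest from all centroids chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the cluster labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical 2D feature vectors (not from the study), one per video.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

labels, centroids = kmeans(points, k=3)
print(labels)
```

The cluster indices themselves are arbitrary; what matters is which points share a label. In practice, the number of clusters is usually chosen by comparing several values of k rather than fixed in advance.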

Conclusions:

Our results indicate that demographic attributes factor into YouTube’s algorithmic systems in the context of e-cigarette-related queries. Specifically, differences in the age and gender attributes of user profiles resulted in variation in both the videos presented in YouTube search results and the types of these videos. We found that underage profiles were exposed to e-cigarette content despite YouTube’s age-restriction policy, which ostensibly prohibits certain e-cigarette content. Greater enforcement of policies to restrict youth access to e-cigarette content is needed.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.