Accepted for/Published in: JMIR Infodemiology
Date Submitted: Aug 26, 2022
Open Peer Review Period: Aug 26, 2022 - Oct 21, 2022
Date Accepted: Feb 6, 2023
Effects of User Profile Attributes on E-cigarette-Related Searches on YouTube: Machine Learning Clustering and Classification
ABSTRACT
Background:
The proliferation of e-cigarette content on YouTube is concerning because of its possible effect on youth use behaviors. YouTube has a personalized search and recommendation algorithm that derives attributes from a user’s profile, such as age and gender. However, little is known about whether e-cigarette content is shown differently based on user characteristics.
Objective:
To understand the effect of the age and gender of user profiles on e-cigarette-related YouTube search results.
Methods:
We created 16 fictitious YouTube profiles varying by age (16 and 24 years old), gender (female, male), and ethnicity/race, and used them to search for 18 e-cigarette-related search terms. We used unsupervised and supervised machine learning (k-means clustering and classification with Graph Convolutional Networks [GCN]) and network theory to characterize the variation in search results across profiles. We further examined whether user attributes may play a role in exposure to e-cigarette-related content using networks and degree centrality.
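As a rough illustration of the degree centrality measure mentioned above, the sketch below builds a small profile-to-video network and computes normalized degree centrality (degree divided by the number of other nodes). The profile labels, video IDs, and the `degree_centrality` helper are hypothetical and are not taken from the study; the authors' actual network construction may differ.

```python
from collections import defaultdict

# Hypothetical toy data: each profile maps to the set of video IDs
# returned across its searches. These values are illustrative only.
search_results = {
    "F16": {"v1", "v2", "v3"},
    "M16": {"v2", "v3"},
    "F24": {"v3", "v4"},
    "M24": {"v3"},
}


def degree_centrality(results):
    """Return normalized degree centrality for every node in the
    bipartite profile-video network (degree / (number of nodes - 1))."""
    degree = defaultdict(int)
    for profile, videos in results.items():
        for video in videos:
            # Each profile-video pair is one undirected edge.
            degree[profile] += 1
            degree[video] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


centrality = degree_centrality(search_results)

# Print nodes from most to least central.
for node, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

In this toy network, a video shown to every profile receives the highest centrality, while a video shown to only one profile receives the lowest. Comparing such scores is one way to see which content is shared across profiles and which is specific to some of them.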
Results:
We included 4,201 non-duplicate videos. Our k-means clustering suggests that the videos can be clustered into three categories. The Graph Convolutional Network achieved high accuracy (0.72). Videos are classified based on content into 4 categories: Product Review (49.3%), Health Info (15.1%), Instruction (26.9%), Other (8.5%). Underage users were exposed most to “instruction” videos (37.5%), with some indication that female 16-year-old profiles were more exposed to this content, while older age groups (24 years old) were most exposed to “product review” videos (39.2%).
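Exposure percentages of the kind reported above can be tabulated as the share of each video category within each age group. The following is a minimal sketch under stated assumptions: the `exposures` list and the `category_shares` helper are hypothetical, and the toy values do not reproduce the study's figures.

```python
from collections import Counter

# Hypothetical toy data: one (profile age, video category) pair per
# video shown to a profile. Not the study's data.
exposures = [
    (16, "Instruction"), (16, "Instruction"),
    (16, "Product Review"), (16, "Health Info"),
    (24, "Product Review"), (24, "Product Review"),
    (24, "Instruction"), (24, "Other"),
]


def category_shares(pairs):
    """Return, for each age group, the fraction of its exposures
    that fall into each video category."""
    counts = {}
    for age, category in pairs:
        counts.setdefault(age, Counter())[category] += 1
    return {
        age: {cat: n / sum(c.values()) for cat, n in c.items()}
        for age, c in counts.items()
    }


shares = category_shares(exposures)

for age, dist in shares.items():
    top = max(dist, key=dist.get)
    print(f"age {age}: most exposed to {top} ({dist[top]:.1%})")
```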
Conclusions:
Our results indicate that demographic attributes factor into YouTube’s algorithmic systems in the context of e-cigarette-related queries. Specifically, differences in the age and gender attributes of user profiles result in variance in both the videos presented in YouTube search results and the types of these videos. We find that underage profiles were exposed to e-cigarette content despite YouTube’s age-restriction policy that ostensibly prohibits certain e-cigarette content. Greater enforcement of policies to restrict youth access to e-cigarette content is needed.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.