Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 22, 2019
Date Accepted: May 27, 2019
Detection, Evaluation, and Characterization of Illicit Digital Drug Dealers on Instagram using Machine Learning Approaches
ABSTRACT
Background:
Social media use is now ubiquitous, but the growth in social media communications has also made these platforms a convenient digital channel for drug dealers selling controlled substances, opioids, and other illicit drugs. Previous studies and news investigations have reported the use of popular social media platforms as conduits for opioid sales. This study uses deep learning to detect illicit drug dealing on the image- and video-sharing platform Instagram.
Objective:
The aim of this study was to develop and evaluate a machine learning approach to detect Instagram posts related to illegal online drug dealing.
Methods:
In this paper, we describe an approach to detect drug dealers on Instagram using a deep learning model. We collected Instagram posts with a web scraper from July 2018 to October 2018 and then compared our deep learning model against three other machine learning models (Random Forest, Decision Tree, and Support Vector Machine) to assess its performance and accuracy. For our deep learning model, we used long short-term memory (LSTM) units in a recurrent neural network (RNN) to learn the textual patterns of drug dealing posts. We also manually annotated all collected posts in order to evaluate model performance and to characterize drug selling conversations.
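To make the evaluation step concrete, the following is a minimal sketch of how a model's predictions could be scored against manual annotations using the F1 score. It is an illustration only, not the study's code: the function name `f1_score`, the label encoding (1 for a drug dealing post, 0 otherwise), and the toy labels are all assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score of predictions against manual annotations.

    Assumption: labels are encoded as 1 (drug dealing post) and 0 (other).
    """
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: five annotated posts and a model's predictions.
annotated = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
score = f1_score(annotated, predicted)
```

Under cross-validation, a score like this would be computed on each held-out fold and then averaged across folds.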
Results:
From the 12,857 posts we collected, we detected 1,228 drug dealer posts from 267 unique users. We used cross-validation to evaluate the four models; our deep learning model reached an F1 score of 95%, performing better than the other three models. We also found that model performance was highest when hashtags were removed from the text. Detected posts contained hashtags related to several drugs, including the controlled substances Xanax (87.8%, n=1078) and Oxycodone/Oxycontin (26.1%, n=321), and the illicit drugs LSD (17.3%, n=213) and MDMA (7.6%, n=94). We also observed, through user comments, the use of communication applications for suspected drug trading.
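The hashtag removal step mentioned above could look something like the following sketch. It assumes that "removing the hashtags" means dropping each whole hashtag token (the `#` and the word attached to it) before the text is passed to the model; the abstract does not specify this, and the function name `strip_hashtags` and the sample caption are hypothetical.

```python
import re

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#\w+")


def strip_hashtags(text):
    """Remove hashtag tokens from a post and normalize whitespace."""
    without_tags = HASHTAG_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by removed hashtags.
    return " ".join(without_tags.split())


# Hypothetical caption, for illustration only.
caption = "bars available #xanax #lsd message me on the app"
cleaned = strip_hashtags(caption)
```

If only the `#` symbol were meant to be removed while keeping the tag word, the pattern would instead be `#` alone; which variant the study used is not stated here.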
Conclusions:
Our approach using a combination of web scraping and deep learning was able to detect illegal online drug sellers on Instagram with high accuracy. Despite increased scrutiny by regulators and policymakers, the Instagram platform continues to host posts from drug dealers, in violation of federal law. Further action needs to be taken to ensure the safety of social media communities and help put an end to this illicit digital channel of sourcing.
Clinical Trial: not applicable
Citation