Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 22, 2019
Date Accepted: May 27, 2019

The final, peer-reviewed published version of this preprint can be found here:

Li J, Xu Q, Shah N, Mackey T

A Machine Learning Approach for the Detection and Characterization of Illicit Drug Dealers on Instagram: Model Evaluation Study

J Med Internet Res 2019;21(6):e13803

DOI: 10.2196/13803

PMID: 31199298

PMCID: 6598421

Detection, Evaluation, and Characterization of Illicit Digital Drug Dealers on Instagram using Machine Learning Approaches

  • Jiawei Li
  • Qing Xu
  • Neal Shah
  • Tim Mackey

ABSTRACT

Background:

Social media use is now ubiquitous, but growth in social media communications has also made it a convenient digital platform for drug dealers selling controlled substances, opioids, and other illicit drugs. Previous studies and news investigations have reported use of popular social media platforms as conduits for opioid sales. This study uses deep learning to detect illicit drug dealing on the image and video sharing platform Instagram.

Objective:

The aim of this study was to develop and evaluate a machine learning approach to detect Instagram posts related to illegal online drug dealing.

Methods:

In this paper, we describe an approach to detecting drug dealers on Instagram using a deep learning model. We collected Instagram posts using a web scraper from July 2018 to October 2018 and then compared our deep learning model against three other machine learning models (random forest, decision tree, and support vector machine) to assess its performance and accuracy. For the deep learning model, we used long short-term memory (LSTM) units in a recurrent neural network (RNN) to learn patterns in the text of drug dealing posts. We also manually annotated all collected posts in order to evaluate model performance and to characterize drug selling conversations.
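
To illustrate what an LSTM unit computes as it reads a post token by token, the following is a minimal sketch of one scalar LSTM step using the standard gate equations. It is not the study's model: the scalar inputs, the `params` layout, and the function names `lstm_step` and `run_sequence` are assumptions made for illustration, and a real model would use vector-valued embeddings and learned weight matrices.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM unit (illustrative only).

    params maps each gate name to a hypothetical (w_x, w_h, bias) triple.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))        # how much new information to write
    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value for the cell state
    c = f * c_prev + i * g           # updated cell state (long-term memory)
    h = o * math.tanh(c)             # updated hidden state (short-term output)
    return h, c


def run_sequence(xs, params):
    """Feed a sequence of scalar inputs through the unit; return the final hidden state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h
```

In a text classifier, the final hidden state would typically be passed to a small output layer that scores the post as drug dealing or not; that layer is omitted here.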

Results:

From the 12,857 posts we collected, we detected 1,228 drug dealer posts from 267 users. We used cross-validation to evaluate the four models; our deep learning model reached an F1-score of 95%, performing better than the other three models. We also found that model performance was highest when hashtags were removed from the text. Detected posts contained hashtags related to several drugs, including the controlled substances Xanax (87.8%, n=1078) and oxycodone/OxyContin (26.1%, n=321) and the illicit drugs LSD (17.3%, n=213) and MDMA (7.6%, n=94). We also observed, through user comments, the use of communication applications for suspected drug trading.
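
The evaluation steps mentioned above (hashtag removal, cross-validation, and the F1-score) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's code: the hashtag pattern, the contiguous fold layout, the number of folds, and the helper names `strip_hashtags`, `f1_score`, and `k_fold_indices` are all hypothetical, and the sample text in the usage note is invented.

```python
import re

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")


def strip_hashtags(text: str) -> str:
    """Remove hashtag tokens and collapse the leftover whitespace."""
    return " ".join(HASHTAG.sub(" ", text).split())


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int):
    """Yield (train_indices, test_indices) for k contiguous folds over n items."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop
```

For example, `strip_hashtags("DM me #xanax #bars for prices")` returns `"DM me for prices"`. In a full evaluation, each model would be trained on the `train` indices of every fold and scored with `f1_score` on the held-out `test` indices, with the per-fold scores then averaged.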

Conclusions:

Our approach using a combination of web scraping and deep learning was able to detect illegal online drug sellers on Instagram with high accuracy. Despite increased scrutiny by regulators and policymakers, the Instagram platform continues to host posts from drug dealers, in violation of federal law. Further action needs to be taken to ensure the safety of social media communities and help put an end to this illicit digital channel of sourcing.

Clinical Trial:

Not applicable

