Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 10, 2022
Open Peer Review Period: Mar 10, 2022 - May 5, 2022
Date Accepted: Sep 19, 2022
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Classifying comments on social media related to living kidney donation
ABSTRACT
Background:
Living kidney donation (LKD) currently accounts for approximately a quarter of all kidney transplant donors. Barriers such as medical ineligibility and the costs associated with donation preclude prospective donors from donating. A better understanding of perceptions of, and barriers to, living donation can facilitate the development of effective policies, educational opportunities, and outreach strategies, which may increase the number of LKDs. Prior research has focused predominantly on the perceptions and barriers experienced by a small subset of individuals with prior exposure to the donation process; the viewpoints of the general public are rarely represented.
Objective:
The current study designed a web-scraping method and machine learning algorithms for collecting and classifying comments from a variety of online sources. The resulting dataset was made available in the public domain to facilitate further investigation of this topic.
Methods:
Using web-scraping tools in Python, we collected comments from the New York Times (NYT), YouTube, Twitter, and the forum site Reddit. We developed a set of guidelines for creating training data and for manually classifying comments as either related to living organ donation or not. We then classified the remaining comments using deep learning.
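The workflow described above (label a subset by hand, train on it, then classify the remaining comments) can be illustrated with a minimal sketch. This is not the study's model: the abstract reports a deep neural network, whereas the example below swaps in a small multinomial naive Bayes classifier so that it runs with the Python standard library alone. The function names, the toy comments, and the label convention (1 = related to living donation, 0 = unrelated) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(labeled):
    """Count words per class from (comment, label) pairs; labels are 0 or 1."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the more probable label for one comment."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


# Hypothetical hand-labeled training comments (not from the study's dataset).
labeled = [
    ("I donated a kidney to my sister last year", 1),
    ("Would you give a kidney to a stranger while alive", 1),
    ("Living donors should have their medical costs covered", 1),
    ("The traffic downtown was terrible this morning", 0),
    ("Great recipe, I will cook this tonight", 0),
    ("This movie trailer looks amazing", 0),
]

# Hypothetical comments that were not manually labeled.
unlabeled = ["My brother wants to donate a kidney", "I love this recipe"]

model = train_naive_bayes(labeled)
predictions = [predict(model, comment) for comment in unlabeled]
print(predictions)  # [1, 0]
```

A bag-of-words baseline like this ignores word order and context, which is one reason a study might prefer a neural model for short, informal comments.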
Results:
A total of 203,219 unique comments were collected from the above sources. The deep neural network model achieved 84% accuracy on the testing data. Further validation of its predictions found an actual accuracy of 63%. The final database contains 11,027 comments classified as related to LKD.
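The gap between the two accuracy figures can be made concrete with a small sketch. It assumes, as one plausible reading of the abstract, that the first figure comes from a held-out test split and the second from manually reviewing a sample of the model's predictions on previously unlabeled comments. The `accuracy` helper and every number in the example are hypothetical and chosen only to show the calculation; they are not the study's data.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the reference labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical held-out test split: labels were assigned before training.
test_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
test_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
test_accuracy = accuracy(test_true, test_pred)

# Hypothetical follow-up review: a sample of comments the model predicted
# as related (1), checked by hand after the fact.
reviewed_true = [1, 1, 0, 1, 0, 1, 0, 1]
reviewed_pred = [1, 1, 1, 1, 1, 1, 1, 1]
reviewed_accuracy = accuracy(reviewed_true, reviewed_pred)

print(test_accuracy)      # 0.8
print(reviewed_accuracy)  # 0.625
```

A lower figure on reviewed predictions than on the test split can arise when the unlabeled comments differ in distribution from the labeled training data.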
Conclusions:
The current study laid the groundwork for a more comprehensive analysis of perceptions, myths, and feelings about LKD. The web-scraping method and machine learning classifier are effective methods for collecting and examining opinions on LKD held by the general public.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.