JMIR Preprints #12876: Health social network analytics: utilizing social media to detect the outcome of chronic diseases

Health social network analytics: utilizing social media to detect the outcome of chronic diseases

Vasiliki Foufi;
Tatsawan Timakum;
Christophe Gaudet-Blavignac;
Christian Lovis;
Min Song

ABSTRACT

Background:

Social media constitutes a valuable resource for text mining tasks. In the healthcare domain, multiple forums and blogs have been created where people share their personal experiences and seek other people’s knowledge and advice.

Objective:

This paper reports a study of entities related to chronic diseases, and of the relationships among them, in user-generated content on social media. The main focus of the study is on understanding the characteristics of disease entities and their relations from the user’s perspective.

Methods:

We collected a corpus of 17,624 text posts from disease-specific subreddits of the internet community Reddit.com. For entity and relation extraction from these data, we employed the PKDE4J tool, a text mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework.
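
As a rough illustration of what combining dictionary-based entity extraction with rule-based relation extraction can look like, the following minimal Python sketch may help. It is not PKDE4J and does not use its API: the dictionary entries, the trigger verbs, and the function names `extract_entities` and `extract_relations` are all hypothetical, chosen only to show the general idea on a toy sentence.

```python
import re

# Hypothetical toy dictionary; PKDE4J's actual dictionaries are not reproduced here.
ENTITY_DICT = {
    "diabetes": "DISEASE",
    "asthma": "DISEASE",
    "insulin": "DRUG",
    "metformin": "DRUG",
}

# Hypothetical trigger verbs mapped to relation labels (assumption, for illustration).
RELATION_VERBS = {"treats": "TREATS", "controls": "TREATS", "causes": "CAUSES"}


def extract_entities(sentence):
    """Return sorted (start offset, lowercased term, type) for each dictionary hit."""
    found = []
    for term, etype in ENTITY_DICT.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, sentence, re.IGNORECASE):
            found.append((m.start(), m.group(0).lower(), etype))
    return sorted(found)


def extract_relations(sentence):
    """Toy rule: adjacent entities with a trigger verb between them form a relation."""
    entities = extract_entities(sentence)
    relations = []
    for (s1, e1, _), (s2, e2, _) in zip(entities, entities[1:]):
        # Text from the start of the first entity up to the start of the second.
        between = sentence[s1:s2].lower()
        for verb, label in RELATION_VERBS.items():
            if re.search(r"\b" + verb + r"\b", between):
                relations.append((e1, label, e2))
    return relations


post = "Metformin controls my diabetes pretty well."
print(extract_entities(post))
print(extract_relations(post))
```

A real system would rely on much larger curated dictionaries and on syntactic rules rather than a single verb lookup; the sketch only shows how the two stages fit together.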

Results:

Using PKDE4J, we extracted two types of entities and relations: biomedical entities and relationships, and subject-predicate-object entity relationships. In total, 82,138 entities and 30,341 relation pairs were extracted from the Reddit dataset.

Conclusions:

This study paves the way for making user-generated content on health-oriented social media available to scientists working on the development of patient treatments. These data may not be available in the literature or from laboratory experiments. The results reported in this paper are promising, and indicate the need for more in-depth studies on the best way to respond to users’ medical needs and concerns as expressed on social media.

Citation

Please cite as:

Foufi V, Timakum T, Gaudet-Blavignac C, Lovis C, Song M

Mining of Textual Health Information from Reddit: Analysis of Chronic Diseases With Extracted Entities and Their Relations

J Med Internet Res 2019;21(6):e12876

DOI: 10.2196/12876

PMID: 31199327

PMCID: 6595941

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Nov 22, 2018

Date Accepted: May 21, 2019

Per the author's request the PDF is not available.