JMIR Preprints #24473: Predicting cardiovascular risk using data from social media

Predicting cardiovascular risk using data from social media

Anietie Andy;
Sharath Guntuku;
Srinath Adusumalli;
David Asch;
Peter Groeneveld;
Lyle Ungar;
Raina Merchant

ABSTRACT

Background:

Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations, and efforts are underway to improve their discriminatory power.

Objective:

We sought to evaluate the discriminatory power of social media posts for predicting 10-year ASCVD risk, as compared to the pooled cohort risk equations (PCEs).

Methods:

We obtained consent from patients receiving care in an urban academic emergency department to share access to their Facebook posts and electronic medical records (EMR). We retrieved Facebook status updates from up to 5 years prior to study enrollment for all consenting patients. We identified patients (n=181) who had no prior history of coronary heart disease, had an ASCVD score in their EMR, and had more than 200 words in their Facebook posts. Using Facebook posts from these patients, we applied a machine learning (ML) model to predict 10-year ASCVD risk scores. Using an ML model and a psycholinguistic dictionary, Linguistic Inquiry and Word Count (LIWC), we evaluated whether language from posts alone could predict differences in risk scores and whether certain words were associated with risk categories, respectively.

Results:

An ML model predicted the 10-year ASCVD risk scores for these categories: <5%, 5% - 7.4%, 7.5% - 9.9%, and >=10%, with AUCs of 0.78, 0.57, 0.72, and 0.61, respectively. An ML model distinguished between low risk (<10%) and high risk (>10%) with an AUC of 0.69. Additionally, an ML model predicted the ASCVD risk score with Pearson’s r = 0.26. Using LIWC, patients with higher ASCVD scores were more likely to use words associated with sadness (Pearson’s r = 0.32).
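For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC and a Pearson correlation can be computed from true and predicted risk scores. It is not the authors' pipeline: the function names, the category cut points between listed ranges (for example, treating 7.45% as below 7.5%), and the sample numbers are all assumptions made for illustration.

```python
from math import sqrt


def risk_category(score_percent):
    """Map a 10-year ASCVD risk score (percent) to one of the four listed categories.

    Assumption: upper bounds are exclusive, so a score between 7.4 and 7.5
    falls in the "5%-7.4%" bin. The abstract does not specify this.
    """
    if score_percent < 5.0:
        return "<5%"
    if score_percent < 7.5:
        return "5%-7.4%"
    if score_percent < 10.0:
        return "7.5%-9.9%"
    return ">=10%"


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data: EMR risk scores and model predictions, both in percent.
true_scores = [2.0, 4.0, 6.0, 8.0, 12.0, 15.0]
predicted = [3.0, 2.5, 7.0, 6.5, 9.0, 14.0]

# One-vs-rest AUC for a single category, here the ">=10%" bin.
is_high_risk = [risk_category(s) == ">=10%" for s in true_scores]
print("AUC (>=10% vs rest):", auc(is_high_risk, predicted))
print("Pearson r:", round(pearson_r(true_scores, predicted), 3))
```

A per-category AUC like those reported would repeat the one-vs-rest step for each of the four bins, using whatever score the model assigns to that bin.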

Conclusions:

Language used on social media can provide insights about an individual’s ASCVD risk and inform approaches to risk modification.

Citation

Please cite as:

Andy A, Guntuku S, Adusumalli S, Asch D, Groeneveld P, Ungar L, Merchant R

Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models

JMIR Cardio 2021;5(1):e24473

DOI: 10.2196/24473

PMID: 33605888

PMCID: 8411430

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Cardio

Date Submitted: Sep 21, 2020

Date Accepted: Jan 15, 2021
