Accepted for/Published in: JMIR Cardio

Date Submitted: Sep 21, 2020
Date Accepted: Jan 15, 2021

The final, peer-reviewed published version of this preprint can be found here:

Andy A, Guntuku S, Adusumalli S, Asch D, Groeneveld P, Ungar L, Merchant R

Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models

JMIR Cardio 2021;5(1):e24473

DOI: 10.2196/24473

PMID: 33605888

PMCID: 8411430

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Predicting cardiovascular risk using data from social media

  • Anietie Andy
  • Sharath Guntuku
  • Srinath Adusumalli
  • David Asch
  • Peter Groeneveld
  • Lyle Ungar
  • Raina Merchant

ABSTRACT

Background:

Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations, and efforts are underway to improve their discriminatory power.

Objective:

We sought to evaluate the discriminatory power of social media posts for predicting 10-year ASCVD risk, as compared with the pooled cohort risk equations (PCEs).

Methods:

We obtained consent from patients receiving care in an urban academic emergency department to share access to their Facebook posts and electronic medical records (EMR). We retrieved Facebook status updates from up to 5 years prior to study enrollment for all consenting patients. We identified patients (n=181) who had no prior history of coronary heart disease, had an ASCVD score in their EMR, and had more than 200 words in their Facebook posts. Using Facebook posts from these patients, we applied a machine learning (ML) model to predict 10-year ASCVD risk scores. Using an ML model and a psycholinguistic dictionary, Linguistic Inquiry and Word Count (LIWC), we evaluated whether language from posts alone could predict differences in risk scores and examined the association of certain words with risk categories, respectively.
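As a rough illustration of the Methods, the inclusion criteria and a dictionary-based word-category rate might be sketched as follows. This is not the authors' pipeline: the patient record fields (`posts`, `prior_chd`, `ascvd_score`), the tokenizer, and the tiny `SADNESS_WORDS` set are hypothetical stand-ins, and the word set is not the actual LIWC lexicon.

```python
import re

# Hypothetical stand-in for one dictionary category; NOT the actual LIWC lexicon.
SADNESS_WORDS = {"sad", "cry", "grief", "lonely"}

# Threshold taken from the Methods: more than 200 words across a patient's posts.
MIN_WORDS = 200


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_eligible(patient):
    """Apply the three inclusion criteria described in the Methods.

    `patient` is an assumed dict with keys 'posts' (list of str),
    'prior_chd' (bool), and 'ascvd_score' (float or None).
    """
    n_words = sum(len(tokenize(post)) for post in patient["posts"])
    return (
        not patient["prior_chd"]
        and patient["ascvd_score"] is not None
        and n_words > MIN_WORDS
    )


def category_rate(posts, category_words):
    """Fraction of a patient's tokens that fall in a dictionary category."""
    tokens = [tok for post in posts for tok in tokenize(post)]
    if not tokens:
        return 0.0
    return sum(tok in category_words for tok in tokens) / len(tokens)
```

A per-patient rate such as `category_rate(patient["posts"], SADNESS_WORDS)` could then be correlated with the EMR risk score, which is the kind of association the LIWC analysis reports.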

Results:

An ML model predicted 10-year ASCVD risk scores in the categories <5%, 5%-7.4%, 7.5%-9.9%, and >=10% with AUCs of 0.78, 0.57, 0.72, and 0.61, respectively. An ML model distinguished between low risk (<10%) and high risk (>10%) with an AUC of 0.69. Additionally, an ML model predicted the ASCVD risk score with Pearson’s r = 0.26. In the LIWC analysis, patients with higher ASCVD scores were more likely to use words associated with sadness (Pearson’s r = 0.32).
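For readers less familiar with the two metrics reported above, the following minimal sketch shows how an AUC and a Pearson correlation can be computed from scratch, along with one way to bin a risk score into the four categories. The function names and the use of risk expressed as a fraction are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math


def auc(labels, scores):
    """AUC as the probability that a positive example outranks a negative one.

    `labels` holds 1 for the target class and 0 otherwise; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def risk_category(score):
    """Map a 10-year risk (as a fraction, e.g. 0.08) to the four reported bins."""
    if score < 0.05:
        return "<5%"
    if score < 0.075:
        return "5%-7.4%"
    if score < 0.10:
        return "7.5%-9.9%"
    return ">=10%"
```

A per-category AUC like those reported can be obtained one-vs-rest: set the label to 1 for patients in the category of interest and 0 for everyone else, then pass the model's score for that category to `auc`.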

Conclusions:

Language used on social media can provide insights about an individual’s ASCVD risk and inform approaches to risk modification.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.