Accepted for/Published in: JMIR Mental Health
Date Submitted: May 31, 2022
Date Accepted: Oct 28, 2022
Cross-Platform Detection Of Psychiatric Hospitalization Via Social Media Data: A Comparison Study
ABSTRACT
Background:
Previous research has shown the feasibility of using social media data from a single platform (e.g., Facebook or Twitter) to distinguish individuals with a diagnosis of mental illness, or those experiencing an adverse outcome, from healthy volunteers. However, the performance of these models on data from other social media platforms unseen in the training data (e.g., Instagram, TikTok) has not been investigated.
Objective:
This study aims to explore whether, given that online identities are fragmented across social media platforms, models achieve better testing performance on data from social media platforms seen during training than on data from unseen platforms. It also aims to explain any such discrepancies in performance, if they are found.
Methods:
Windowed timeline data from three platforms, with clinically verified labels of hospitalization among patients with a diagnosis of schizophrenia, were gathered: Facebook (N = 254), Twitter (N = 54), and Instagram (N = 124). We then used a 3 x 3 combinatorial binary classification design to test each model’s performance on testing data from all available platforms. We further compared results from models in intra-platform experiments (i.e., training and testing data belong to the same platform) with results from models in inter-platform experiments (i.e., training and testing data belong to different platforms). Finally, we used SHapley Additive exPlanations (SHAP) to extract the top predictive features and explain the underlying constructs that predict hospitalization on each platform.
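The 3 x 3 design can be pictured as every pairing of a training platform with a testing platform. The following is a minimal sketch of that grid only, not the authors' pipeline; the function name `experiment_grid` and the `PLATFORMS` list are illustrative assumptions, and no model is trained here.

```python
from itertools import product

# Platform names taken from the abstract; the ordering is arbitrary.
PLATFORMS = ["Facebook", "Twitter", "Instagram"]


def experiment_grid(platforms):
    """Return every (train, test, kind) triple for the combinatorial design.

    `kind` is "intra" when the training and testing platform match,
    and "inter" when they differ. (Hypothetical helper for illustration.)
    """
    grid = []
    for train, test in product(platforms, repeat=2):
        kind = "intra" if train == test else "inter"
        grid.append((train, test, kind))
    return grid


grid = experiment_grid(PLATFORMS)
for train, test, kind in grid:
    print(f"train={train:<9} test={test:<9} {kind}")
```

With three platforms this yields nine experiments: three intra-platform and six inter-platform, which are the two groups whose average F1-scores are compared.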
Results:
We found that models in intra-platform experiments achieved an average F1-score of 0.72 in detecting a psychiatric hospitalization due to schizophrenia, which is 68% higher than the average F1-score of 0.428 for models in inter-platform experiments. We also found that combining training data from all three platforms yielded a slight average improvement of 0.5% on the testing sets compared with the original intra-platform models. An analysis of top features for the intra-platform models showed low overlap in predictive features between the platforms, with ‘anger’ being a unique top feature for Facebook and ‘sad’ being a unique top feature for Instagram.
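The 68% figure is consistent with a relative (not absolute) difference between the two reported average F1-scores. A small check, assuming that interpretation; the helper name `relative_increase` is illustrative:

```python
def relative_increase(new, baseline):
    """Relative change of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


intra_f1 = 0.72   # average intra-platform F1-score reported in the abstract
inter_f1 = 0.428  # average inter-platform F1-score reported in the abstract

gain = relative_increase(intra_f1, inter_f1)
print(f"Relative increase: {gain:.1%}")
```

The absolute gap is 0.292 F1 points; dividing by the inter-platform baseline gives roughly 0.68.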
Conclusions:
We demonstrated that models built on one platform’s data to predict critical mental health treatment outcomes, such as hospitalization, may not generalize to another platform, because each platform offers different construct validity. However, combining data from multiple platforms may offer a more comprehensive view of a patient’s state and situation, and may therefore fare better in relapse prediction. With the changing ecosystem of social media use among different demographic groups, and as online identities continue to fragment across platforms, further research on holistic approaches to harnessing these diverse data sources is required.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.