Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 29, 2023
Date Accepted: Mar 25, 2024
Machine learning–based prediction of suicidal thinking in adolescents: Derivation and validation in three independent worldwide cohorts in South Korea, Norway, and the USA
ABSTRACT
Background:
Suicide is the second leading cause of death among adolescents and is associated with clusters of suicides. Despite extensive research on this preventable cause of death, most studies have focused on single nations and traditional statistical methods.
Objective:
This study aims to develop a predictive model for adolescent suicidal thinking using multinational datasets and machine learning (ML).
Methods:
This study used data from the Korea Youth Risk Behavior Web–based Survey (KYRBS), comprising 566,875 adolescents aged 13 to 18 years, and conducted external validation using the Youth Risk Behavior Survey (YRBS) from the USA with 103,874 adolescents and Norway's University National General Survey (Ungdata) with 19,574 adolescents. Several tree–based ML models were developed, and feature importance and SHapley Additive exPlanations (SHAP) values were analyzed to identify risk factors for adolescent suicidal thinking.
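As an illustration of how per-feature attributions such as SHAP values can be summarized as percentage contributions, the following is a minimal sketch. It assumes that each feature's share is its mean absolute attribution divided by the total across features; the abstract does not state the exact normalization used, and the function name `importance_shares` and the toy values are hypothetical.

```python
def importance_shares(attributions):
    """Convert per-sample attribution values into percentage shares.

    attributions: dict mapping feature name -> list of per-sample
    attribution values (for example, SHAP values). Assumption: a
    feature's share is its mean absolute attribution divided by the
    sum of mean absolute attributions over all features.
    """
    mean_abs = {
        name: sum(abs(v) for v in values) / len(values)
        for name, values in attributions.items()
    }
    total = sum(mean_abs.values())
    if total == 0:
        raise ValueError("all attributions are zero; shares are undefined")
    return {name: 100.0 * m / total for name, m in mean_abs.items()}


# Toy attribution values (hypothetical, not taken from the study).
toy = {
    "sadness": [0.6, -0.6, 0.6, -0.6],
    "stress": [0.3, -0.3, 0.3, -0.3],
    "age": [0.1, -0.1, 0.1, -0.1],
}
shares = importance_shares(toy)
```

Taking absolute values before averaging keeps positive and negative attributions from cancelling, so a feature that pushes predictions strongly in both directions still registers as influential.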
Results:
When trained on the KYRBS data from South Korea, the XGBoost model achieved an area under the receiver operating characteristic curve (AUROC) of 90.06% (95% CI, 89.97–90.16), outperforming the other models. In external validation using the YRBS data from the USA and the Ungdata from Norway, the XGBoost model achieved AUROCs of 83.09% and 81.27%, respectively. Across all datasets, XGBoost consistently achieved the highest AUROC and was therefore selected as the optimal model. Among the predictors of suicidal thinking, feelings of sadness and despair were the most influential, accounting for 57.4% of the impact, followed by stress status at 19.8%. These were followed by age (5.7%), household income (4.0%), academic achievement (3.4%), and sex (2.1%), with all other predictors contributing less than 2% each.
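For readers unfamiliar with how an AUROC and its 95% CI can be obtained, the following is a minimal sketch using only the Python standard library. It assumes a percentile bootstrap for the interval; the abstract does not state which method was used to compute the CI, and the function names and toy data are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5).
    Pairwise comparison is O(n^2), which is fine for a small example."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (an assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels (1 = reported suicidal thinking) and model scores; hypothetical.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]
point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
```

With a derivation cohort as large as the one reported here, resampling yields a very narrow interval, which is consistent with the tight CI around the KYRBS estimate.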
Conclusions:
This study applied ML to diverse datasets from three countries to address adolescent suicide. The findings highlight the important role of emotional health indicators in predicting suicidal thinking among adolescents. Specifically, sadness and despair were identified as the most significant predictors, followed by stressful conditions and age. These findings emphasize the critical need for early diagnosis and prevention of mental health issues during adolescence.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.