Accepted for/Published in: JMIR Formative Research
Date Submitted: Nov 6, 2023
Date Accepted: Sep 23, 2024
Early Identification of Cognitive Impairment in Community Environments Through Modeling Subtle Inconsistencies in Questionnaire Responses: Machine Learning Model Development and Validation
ABSTRACT
Background:
The underdiagnosis of cognitive impairment (CI) hinders timely prevention of and intervention for dementia. Health professionals working in communities play a critical role in the early detection of CI, yet they still face several challenges, such as a lack of suitable tools and necessary training, as well as potential stigmatization.
Objective:
This study explored a novel application that integrates psychometric methods with data science techniques to model subtle mistakes in questionnaire response data, with the aim of enhancing early identification of CI in community environments.
Methods:
This study analyzed questionnaire response data from participants aged 50 years and older in the Health and Retirement Study (Waves 8-9, n=12,942). Predictors included low-quality response (LQR) indices generated using the graded response model from four brief questionnaires (Optimism, Hopelessness, Purpose in Life, and Life Satisfaction) assessing aspects of overall well-being, a focus of health professionals in communities. The primary predicted outcome was current CI, derived from a validated criterion; the supplemental outcome was dementia or mortality within the next 10 years. A multilayer perceptron (MLP) was used as the predictive model, and its performance was compared with that of six other predictive methods.
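For readers unfamiliar with the graded response model named above, its standard (Samejima) form for an ordered-category item is sketched below. The symbols are assumptions introduced here for illustration: $\theta_i$ is the latent trait of respondent $i$, $a_j$ is the discrimination of item $j$, and $b_{jk}$ is the threshold for responding in category $k$ or higher. The abstract does not state how the LQR indices are computed from the fitted model, so this shows only the model itself, not the authors' index.

```latex
% Cumulative probability of responding in category k or higher on item j
P^{*}_{jk}(\theta_i) = P(X_{ij} \ge k \mid \theta_i)
  = \frac{1}{1 + \exp\!\left[-a_j\,(\theta_i - b_{jk})\right]}

% Probability of responding in exactly category k
% (with P^{*}_{j0} = 1 and P^{*}_{j,K_j+1} = 0 for K_j the highest category)
P(X_{ij} = k \mid \theta_i) = P^{*}_{jk}(\theta_i) - P^{*}_{j,k+1}(\theta_i)
```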
Results:
The MLP exhibited the best performance in predicting current CI across questionnaires. Across the four selected questionnaires, the area under the curve (AUC) for identifying current CI ranged from 0.63 to 0.66 and improved to 0.71 to 0.74 when the LQR indices were combined with age and gender for prediction. We set the threshold for assessing CI risk in the tool based on the ratio of underdiagnosis costs to overdiagnosis costs, with a ratio of 4 as the default. In addition, the tool was more efficient than age-based or health-based screening strategies at identifying individuals at high risk of CI. The tool has been deployed on a web portal and is freely accessible to the public.
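One common way to turn a cost ratio into a risk threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the procedure used in the tool: it assumes the model outputs calibrated probabilities and applies the textbook expected-cost rule, under which a person is flagged when p × (underdiagnosis cost) exceeds (1 − p) × (overdiagnosis cost), that is, when p exceeds 1 / (1 + ratio). The function names are hypothetical.

```python
def cost_ratio_threshold(cost_ratio: float) -> float:
    """Probability cutoff from the expected-cost rule.

    cost_ratio is (cost of a missed case) / (cost of a false alarm).
    Assumption: predicted probabilities are well calibrated.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return 1.0 / (1.0 + cost_ratio)


def flag_high_risk(probabilities, cost_ratio: float = 4.0):
    """Return True for each predicted probability at or above the cutoff."""
    threshold = cost_ratio_threshold(cost_ratio)
    return [p >= threshold for p in probabilities]


if __name__ == "__main__":
    # With the default ratio of 4, the cutoff is 1 / (1 + 4) = 0.2.
    print(cost_ratio_threshold(4.0))
    print(flag_high_risk([0.10, 0.25, 0.50]))
```

Under this rule, a larger ratio lowers the cutoff, so more people are flagged and fewer cases are missed, at the price of more false alarms.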
Conclusions:
We developed a novel machine learning tool that integrates psychometric methods with data science to facilitate "passive/backend" CI assessments in community settings, aiming to promote early CI detection. This tool simplifies the CI assessment process, making it more adaptable and reducing the burden on both professionals and communities. Our approach also presents a new perspective on using questionnaire data: leveraging, rather than dismissing, low-quality data.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.