Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Aug 2, 2023
Date Accepted: Nov 27, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Text Dialogue Analysis-Based ChatGPT for Primary Screening of Mild Cognitive Impairment
ABSTRACT
Background:
AI models tailored to diagnosing cognitive impairment have shown excellent results. However, it remains unclear whether large language models can rival specialized models using text alone.
Objective:
We aimed to explore the effectiveness of ChatGPT for primary screening of mild cognitive impairment (MCI) and to standardize the design steps and components of the prompt.
Methods:
We obtained data from 174 participants in DementiaBank and allocated 70% of them to the training set and 30% to the test set. Variables were drawn from three dimensions: vocabulary, syntax and grammar, and semantics. These variables were generated from published studies and from statistical analyses. We used R 4.3.0 for the analysis of the variables and the diagnostic indicators.
Results:
The variables drawn from published studies comprised word frequency and word ratio, phrase frequency and phrase ratio, lexical complexity, syntactic complexity, grammatical components, semantic density, and semantic coherence; the variables retained from the statistical analysis were the tip-of-the-tongue phenomenon (P < 0.001), difficulty with complex ideas (P < 0.001), and memory issues (P < 0.001). The final GPT-4 model achieved a sensitivity (SEN) of 0.8636, a specificity (SPE) of 0.9487, and an area under the curve (AUC) of 0.9062 on the training set; on the test set, the SEN, SPE, and AUC reached 0.7727, 0.8333, and 0.8030, respectively. The prompt consisted of five main parts: character setting, scoring system setting, indicator setting, output setting, and explanatory information setting.
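For readers unfamiliar with the diagnostic indicators reported above, sensitivity, specificity, and AUC can be computed directly from binary labels, predictions, and model scores. The sketch below is illustrative only: the data are made up (not the study's), and it is written in Python, whereas the authors performed their analysis in R 4.3.0.

```python
# Illustrative computation of the reported diagnostic indicators (SEN, SPE, AUC).
# All data below are hypothetical; the study itself used R 4.3.0.

def sensitivity_specificity(y_true, y_pred):
    """SEN = TP / (TP + FN); SPE = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = possible MCI, 0 = control
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
sen, spe = sensitivity_specificity(y_true, y_pred)  # sen = spe = 0.75
```

In practice these metrics would be computed once on the training set and once on the held-out test set, as in the results above.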
Conclusions:
ChatGPT was effective in the primary screening of participants with possible MCI. Improved standardization of prompts by professional clinicians would further improve the performance of the model. It is important to note that ChatGPT is not a substitute for a clinician making a diagnosis.
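The five-part prompt structure reported in the Results can be sketched as a simple template. Everything below is hypothetical: the wording is invented to illustrate the layout (character, scoring system, indicators, output, explanatory information), and is not the authors' actual prompt.

```python
# Hypothetical sketch of the five-part prompt layout described in the Results.
# The wording is illustrative only, not the authors' actual prompt.
PROMPT_PARTS = {
    "character": "You are a clinical language analyst screening dialogue "
                 "transcripts for possible signs of MCI.",
    "scoring": "Rate each indicator from 0 (absent) to 2 (pronounced).",
    "indicators": "Indicators: tip-of-the-tongue phenomenon; difficulty with "
                  "complex ideas; memory issues.",
    "output": "Return one score per indicator and an overall judgment.",
    "explanation": "Indicators were selected from published studies and "
                   "statistical analysis.",
}

def build_prompt(transcript: str) -> str:
    """Assemble the five parts in order, then append the transcript to score."""
    order = ("character", "scoring", "indicators", "output", "explanation")
    body = "\n".join(PROMPT_PARTS[k] for k in order)
    return body + "\nTranscript:\n" + transcript
```

As the Conclusions note, refinement of each component by professional clinicians is what would make such a template clinically useful.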
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.