Accepted for/Published in: JMIR Formative Research

Date Submitted: Jan 27, 2022
Open Peer Review Period: Jan 18, 2022 - Mar 15, 2022
Date Accepted: Dec 5, 2022

The final, peer-reviewed published version of this preprint can be found here:

Wen B, Wang N, Subbalakshmi K, Chandramouli R

Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation

JMIR Form Res 2023;7:e36590

DOI: 10.2196/36590

PMID: 37129944

PMCID: 10189619

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

One-intervention Causal Explanation for Natural Language Processing Based Alzheimer’s Disease Detection

  • Bingyang Wen
  • Ning Wang
  • K.P. Subbalakshmi
  • R. Chandramouli

ABSTRACT

Background:

Machine learning-based detection of Alzheimer’s disease using natural language processing has drawn increasing attention because of its low cost compared with traditional methods. However, most of these models are black boxes whose decision mechanisms are obscure. In fields such as medicine, this obscurity hinders widespread adoption. This has led to the development of a class of techniques generally referred to as explainable AI (XAI). One approach to this problem is counterfactual explanation, which answers “what if” questions such as “What would have happened to Y, had I not done X?”

Objective:

This study aims to improve the transparency of a state-of-the-art language-based Alzheimer’s disease (AD) detection model and to discover linguistic biomarkers that are indicative of AD and can therefore be used as tools for automated diagnosis of AD.

Methods:

In this paper, we propose a new explainable artificial intelligence (XAI) method named one-intervention counterfactual explanation (OICE). The method is applied to a state-of-the-art language-based deep learning model for AD detection and provides explanations of that model's decisions. OICE incorporates causal relationships among the features used in AD detection to make the AI’s decision more transparent. This contrasts with conventional counterfactual explanation methods, which do not incorporate causal mechanisms. An understanding of causal factors can go beyond mere statistical correlation and provide a better understanding of the underlying physical phenomenon. OICE generates counterfactual explanations from a predefined deep learning-based structural causal model (SCM), and it explains the AI’s decision by intervening on only one feature at a time. Because OICE provides explanations for individual samples, we then analyze the counterfactual explanations statistically and define metrics that quantify the effect of every feature.
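The idea of intervening on a single feature and letting the change propagate through a causal model can be sketched in a few lines of Python. This is a toy illustration only, not the authors' implementation: the three features (`pdt`, `dt`, `vbg`), the linear structural equations in `PARENTS`, the classifier weights, and the step-wise search are all assumptions invented for this sketch. The counterfactual is computed with the standard abduction, action, prediction recipe: recover each feature's noise term from the observed sample, set one feature to a new value, then recompute its descendants with the noise held fixed.

```python
# Toy sketch only: graph, coefficients, and classifier are hypothetical.
ORDER = ["pdt", "dt", "vbg"]  # topological order of the assumed causal graph
PARENTS = {"pdt": {}, "dt": {"pdt": 0.5}, "vbg": {"dt": -0.3}}  # linear SCM
WEIGHTS = {"pdt": 2.0, "dt": 1.0, "vbg": -1.0}  # stand-in linear classifier
BIAS = -1.0


def predict(x):
    """Return 1 (AD) if the linear score is positive, else 0 (control)."""
    score = BIAS + sum(WEIGHTS[f] * x[f] for f in ORDER)
    return 1 if score > 0 else 0


def abduct(x):
    """Abduction: recover each feature's exogenous noise from the sample."""
    return {f: x[f] - sum(c * x[p] for p, c in PARENTS[f].items())
            for f in ORDER}


def counterfactual(x, target, value):
    """Action and prediction: fix one feature, then recompute the others
    from the structural equations using the recovered noise."""
    noise = abduct(x)
    cf = {}
    for f in ORDER:
        if f == target:
            cf[f] = value  # the single intervention
        else:
            cf[f] = noise[f] + sum(c * cf[p] for p, c in PARENTS[f].items())
    return cf


def one_intervention_explanation(x, target, step=0.01, max_steps=200):
    """Return the first tested counterfactual (increasing only `target`)
    whose predicted label differs from the original, or None."""
    original = predict(x)
    for k in range(1, max_steps + 1):
        cf = counterfactual(x, target, x[target] + k * step)
        if predict(cf) != original:
            return cf
    return None


sample = {"pdt": 0.10, "dt": 0.30, "vbg": 0.20}  # made-up feature values
explanation = one_intervention_explanation(sample, "pdt")
print("original label:", predict(sample))
print("counterfactual:", explanation)
```

In this toy setup, only `pdt` is set directly, while `dt` and `vbg` change as downstream effects. Repeating the search over many samples and summarizing the required changes per feature would be one simple way to approximate the kind of statistical analysis described above, although the abstract does not specify the metrics.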

Results:

We find 11 language-level biomarkers for Alzheimer’s disease detection, such as adverbs, pronouns, nouns, and prepositions. Previous work in psychology and NLP has identified adverbs, pronouns, and nouns as potential biomarkers, and our study concurs. We also find biomarkers that were not reported in previous studies, such as prepositions and predeterminers. Our results also reveal how these biomarkers are involved in the diagnostic process from a causal perspective. For example, an average 20.2% increase in predeterminers causes changes in determiners, verbs (present participle), and grammatical particles, flipping the diagnosis from control to Alzheimer’s disease. This implies that the predeterminer is potentially a strong indicator of the individual’s health and can function as a strong biomarker.

Conclusions:

Our findings are consistent with previous work in psychology and natural language processing (NLP). Additionally, we offer a new explanation of how intervening on a feature can affect the model’s decisions, using the predefined SCM.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.