Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 25, 2020
Date Accepted: Nov 11, 2020
Can the Use of Attention Mechanisms be Assured in Clinical Research?: Evaluation of Current Design Approaches of Attention Mechanisms in Deep Learning Algorithms
ABSTRACT
Background:
Despite excellent prediction performance, the lack of interpretability has undermined the value of deep learning algorithms in clinical practice. To overcome this limitation, an explanatory modeling method called the attention mechanism has been introduced to clinical research. However, clinical and informatics researchers have been offered little practical guidance on, or precautions for, using this attractive method. Furthermore, there has been little discussion of the predictive and interpretive performance of this method when applied to health data.
Objective:
The purpose of this study is to provide clinical researchers with the basic concepts and design approaches of attention mechanisms. In addition, the study aims to evaluate current design approaches of attention mechanisms in terms of prediction and interpretability performance.
Methods:
First, the basic concepts of attention mechanisms and several key considerations for their use are provided. Second, four design approaches to attention mechanisms are introduced according to a two-dimensional framework based on degree of freedom and uncertainty awareness. Third, five properties are assessed in a diabetes classification modeling setting: 1) prediction performance, 2) probability reliability, 3) concentration of variable importance, 4) consistency of attention results, and 5) generalizability of attention results to conventional statistics. Fourth, the performances of the four attention design approaches are discussed.
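For readers unfamiliar with the basic concept referenced in the Methods, the following is a minimal NumPy sketch of a variable-level attention layer for tabular binary classification. All weights and names here are hypothetical illustrations; this is not the authors' implementation or any of the four evaluated design approaches, only the shared idea: a softmax produces per-variable attention weights that re-weight the inputs and can be read as variable importance.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def variable_attention(x, W_att, w_out):
    """Forward pass of a simple variable-level attention classifier.

    x     : (n_samples, n_vars) input features
    W_att : (n_vars, n_vars) attention scoring weights (hypothetical)
    w_out : (n_vars,) output weights (hypothetical)

    Returns predicted probabilities and per-sample attention weights.
    """
    scores = x @ W_att                    # unnormalized attention scores
    alpha = softmax(scores, axis=1)       # attention weights sum to 1 per sample
    context = alpha * x                   # re-weight each variable by its attention
    logits = context @ w_out
    prob = 1.0 / (1.0 + np.exp(-logits))  # sigmoid for binary classification
    return prob, alpha

# Toy example with random data and untrained weights
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))
W_att = rng.normal(size=(5, 5))
w_out = rng.normal(size=5)
prob, alpha = variable_attention(x, W_att, w_out)
```

Averaging `alpha` over samples is the kind of naive variable-importance readout whose reliability the study evaluates.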
Results:
Prediction performance was very high for all models. Probability reliability was high in models with a high degree of freedom. Variable importance was concentrated in several variables when uncertainty awareness was not considered. Consistency of attention results was high when uncertainty awareness was considered. The generalizability of attention results to conventional statistics was poor regardless of the modeling approach.
Conclusions:
The attention mechanism is clearly an attractive technique that could prove very promising in the future. However, naive attention implementations may yield misleading estimates of variable importance. Therefore, more robust theoretical studies of attention mechanisms should be encouraged.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have granted JMIR Publications an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be published under a CC BY license, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.