JMIR Preprints #81644: Understanding Transformer-based Classifications of Medical Text: Proof of Concept using LLM for Attribution of Feature Importance

Understanding Transformer-based Classifications of Medical Text: Proof of Concept using LLM for Attribution of Feature Importance

Fangwen Zhou;
Ashirbani Saha;
Muhammad Afzal;
Rick Parrish;
R. Brian Haynes;
Alfonso Iorio;
Cynthia Lokker

ABSTRACT

Background:

Deep learning has demonstrated excellent performance in biomedical literature classification. However, the opacity of these models’ decision-making processes limits their interpretability and adoption. Explainable artificial intelligence (XAI) methods, including SHapley Additive exPlanations (SHAP) and integrated gradients (IG), have been proposed to address this issue, yet their computational complexity remains high. Generative large language models (LLMs) may offer a novel approach for generating interpretable and context-aware explanations.

Objective:

To investigate the effectiveness of Generative Pre-trained Transformer (GPT)-4o as a perturbation-based explainer for a BioLinkBERT text classifier by comparing the faithfulness of its explanations with those of the SHAP partition explainer and IG.

Methods:

A stratified sample of 200 articles from the McMaster PLUS and Clinical Hedges databases was classified by BioLinkBERT. GPT-4o, the SHAP partition explainer, and IG were used to generate token-level feature attributions. GPT-based explanations were derived through iterative masking perturbation. Explanations were evaluated using a modified version of the area over the perturbation curve (AOPC), correlation analyses, and qualitative assessment of feature importance attribution.
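As an illustration of the faithfulness metric named above, the following is a minimal sketch of one common formulation of AOPC: tokens are masked in descending order of attribution, and the drop in the predicted probability is averaged over the masking steps. It is not the modified version used in the study, whose details are not given in this abstract. The `[MASK]` placeholder, the function names, and `toy_predict` (a hypothetical stand-in for a real classifier such as BioLinkBERT) are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

MASK = "[MASK]"  # assumed placeholder; a real tokenizer defines its own mask token


def aopc(
    tokens: Sequence[str],
    scores: Sequence[float],
    predict: Callable[[Sequence[str]], float],
    max_k: Optional[int] = None,
) -> float:
    """Mean drop in predicted probability after masking the top-k tokens, k = 1..K."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    k_max = len(tokens) if max_k is None else min(max_k, len(tokens))
    if k_max == 0:
        return 0.0

    # Token indices ranked from most to least important according to the explainer.
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)

    base = predict(tokens)       # probability on the unperturbed text
    masked = list(tokens)
    total_drop = 0.0
    for idx in order[:k_max]:
        masked[idx] = MASK       # cumulative masking: earlier masks stay in place
        total_drop += base - predict(masked)
    return total_drop / k_max


def toy_predict(tokens: Sequence[str]) -> float:
    """Hypothetical classifier: probability rises with the number of rigour cues."""
    cues = {"randomized", "blind"}
    hits = sum(1 for t in tokens if t in cues)
    return 0.1 + 0.4 * hits


if __name__ == "__main__":
    text = ["a", "randomized", "blind", "trial"]
    faithful = [0.0, 0.9, 0.8, 0.1]    # ranks the cue words first
    unfaithful = [0.9, 0.0, 0.1, 0.8]  # ranks the cue words last
    print(aopc(text, faithful, toy_predict))
    print(aopc(text, unfaithful, toy_predict))
```

Under this formulation, an explainer that ranks the truly influential tokens first yields a larger AOPC, because the prediction collapses after only a few masks; an explainer with uninformative rankings yields a value closer to zero.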

Results:

SHAP (AOPC 0.222; 95% confidence interval [CI] 0.200 to 0.244) and IG (AOPC 0.225; 95% CI 0.202 to 0.247) provided consistent and faithful explanations, effectively identifying tokens relevant to study rigour (e.g., "randomized," "blind"). Conversely, GPT-4o explanations were poor (AOPC 0.029; 95% CI 0.014 to 0.043), with nonsensical token attributions. Correlation analysis showed moderate alignment between SHAP and IG (Pearson’s r 0.367), whereas GPT-4o correlated minimally with these established methods (Pearson’s r ≤0.032).
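For readers unfamiliar with how agreement between two explainers can be quantified, the sketch below computes Pearson's r between two token-level attribution vectors for the same text. How the study pooled tokens or documents before computing its correlations is not stated in this abstract, so the per-vector setup and the function name are assumptions; the numbers in the usage example are invented purely for illustration and are not study data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length attribution vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented attribution scores for one four-token text (illustrative only).
    explainer_a = [0.05, 0.60, 0.45, 0.10]
    explainer_b = [0.00, 0.70, 0.30, 0.20]
    print(round(pearson_r(explainer_a, explainer_b), 3))
```

A value near 1 means the two explainers rank tokens similarly, while a value near 0, as reported for GPT-4o against SHAP and IG, means their attributions are essentially unrelated.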

Conclusions:

GPT-4o, despite its advanced contextual capabilities, performed poorly as a standalone explainer compared to established methods like SHAP and IG. These findings highlight the need for further research into specialized prompt engineering and potential hybrid methods integrating LLMs with traditional XAI techniques to improve interpretability without sacrificing computational efficiency or explanation quality.

Citation

Please cite as:

Zhou F, Saha A, Afzal M, Parrish R, Haynes RB, Iorio A, Lokker C

Understanding Transformer-Based Classifications of Medical Text Using a Large Language Model for the Attribution of Feature Importance: Proof-of-Concept Algorithm Development and Validation Study

JMIR Med Inform 2026;14:e81644

DOI: 10.2196/81644

PMID: 42268907

PMCID: 13252589

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 31, 2025

Date Accepted: Apr 29, 2026
