Currently submitted to: JMIR Medical Informatics
Date Submitted: Jun 4, 2026
Open Peer Review Period: Jun 17, 2026 - Aug 12, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Answering the Hard Questions in GIST: A Systematic Review of Artificial Intelligence Applications in Diagnosis, Risk Stratification, Prognosis, and Metastatic Prediction.
ABSTRACT
Background:
In treating patients with gastrointestinal stromal tumors (GISTs), clinicians face significant challenges, particularly in accurate diagnosis, preoperative risk stratification, differentiation from other subepithelial lesions (SELs), and precise prediction of recurrence and metastasis. Traditional diagnostic and prognostic methods are often limited by subjectivity, variability, and insufficient accuracy, leading to diagnostic ambiguity, potential overtreatment, and suboptimal patient outcomes. Addressing these gaps is essential to improving clinical decision-making and advancing precision oncology in GIST. Consequently, there is growing interest in data-driven approaches such as artificial intelligence to enhance diagnostic accuracy, risk stratification, and outcome prediction.
Objective:
This review aims to systematically evaluate current applications of artificial intelligence (AI), machine learning (ML), and deep learning (DL) in GIST, primarily by comparing the most commonly used algorithms, data modalities, prediction tasks, validation approaches, and reported performance across studies. We also aim to identify gaps in clinical applicability, external validation, and interpretability that may hinder the translation of these technologies into routine clinical practice.
Methods:
This systematic review identified 65 original research studies, published between 2011 and 2026, that developed artificial intelligence, machine learning, or deep learning models for GIST clinical applications. Deep learning architectures, particularly convolutional neural networks (eg, ResNet, EfficientNet) and Vision Transformers, were most common for image analysis, while traditional machine learning algorithms (Random Forest, SVM, XGBoost) dominated radiomics-based approaches.
Results:
Diagnostic models achieved the highest performance, with EUS-based approaches reaching 86-96% accuracy, while risk stratification models showed more variable results, particularly for intermediate-risk categories (AUCs 0.64-0.78). Prognostic models demonstrated C-indices of 0.72-0.86, and metastasis prediction models achieved AUCs of 0.87-0.96. External validation was conducted in only 29 of 65 studies, with consistent performance degradation compared to internal validation. Most studies were conducted in Chinese populations (n=46), with limited geographic and ethnic diversity. Single-center studies (n=37) predominated over multi-center collaborations (n=23).
Conclusions:
While AI models demonstrate technical feasibility and promising performance in controlled settings, the evidence base lacks the validation rigor, prospective evaluation, population diversity, and clinical integration necessary for confident clinical deployment in GIST care. Major limitations included selection bias from retrospective designs (62 of 65 studies), technical heterogeneity in imaging protocols affecting reproducibility, and class imbalance, particularly affecting intermediate-risk predictions. There was a lack of prospective validation (only 1 of 65 studies), limited use of interpretability methods (SHAP/LIME used in only 16 studies), limited assessment of clinical utility beyond accuracy metrics such as sensitivity and specificity, and an absence of real-world implementation data. All 65 studies positioned their models as aids to clinical decision-making rather than replacements for physician judgment.
Clinical Trial: n/a
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.