Previously submitted to: JMIR Cancer (no longer under consideration since Jan 09, 2026)
Date Submitted: Jun 17, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Prediction of KRAS, NRAS, BRAF, and HER2 Status in Colorectal Cancer Based on Histopathology Images Via Weakly Supervised Deep Learning
ABSTRACT
Background:
Research has shown that mutations in the KRAS, NRAS, and BRAF genes are linked to resistance to anti-EGFR therapies in colorectal cancer (CRC) patients. HER2-targeted therapies are increasingly being recommended for individuals with HER2 overexpression.
Objective:
Assessing KRAS, NRAS, BRAF, and HER2 status has become an important part of precision diagnosis in CRC. However, conventional molecular and protein testing can be time-consuming and expensive. This study aims to predict KRAS, NRAS, and BRAF status from whole-slide images of CRC samples stained with hematoxylin-eosin (H&E), and HER2 status from samples stained by immunohistochemistry (IHC).
Methods:
We enrolled 435 patients with CRC from Jiangsu Province Hospital of Chinese Medicine. Using the clustering-constrained attention-based multiple-instance learning (CLAM) model, we built four models, one each for predicting KRAS, NRAS, BRAF, and HER2 status from whole-slide images (WSIs).
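For readers unfamiliar with attention-based multiple-instance learning, the slide-level aggregation step at the core of CLAM-style models can be sketched as follows. This is a minimal, dependency-free illustration of gated attention pooling only. It omits CLAM's instance-level clustering constraint and the classifier head, and the function name, dimensions, and random weights are hypothetical placeholders rather than anything taken from this study.

```python
import math
import random


def gated_attention_pool(patches, V, U, w):
    """Pool patch embeddings into one slide embedding with gated attention.

    patches: list of n patch embeddings, each a list of d floats
    V, U:    d x h weight matrices (tanh branch and sigmoid gate)
    w:       length-h vector mapping the gated hidden units to a score
    """
    def project(x, M):
        # Row vector x (d,) times matrix M (d x h) -> (h,)
        return [sum(x[i] * M[i][j] for i in range(len(x)))
                for j in range(len(M[0]))]

    scores = []
    for x in patches:
        a = [math.tanh(v) for v in project(x, V)]
        g = [1.0 / (1.0 + math.exp(-u)) for u in project(x, U)]
        scores.append(sum(ai * gi * wi for ai, gi, wi in zip(a, g, w)))

    # Softmax over patches (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide embedding = attention-weighted sum of patch embeddings.
    d = len(patches[0])
    slide = [sum(att * x[k] for att, x in zip(attention, patches))
             for k in range(d)]
    return slide, attention


# Hypothetical toy input: 50 patches with 8-dimensional embeddings.
random.seed(0)
n_patches, d, h = 50, 8, 4
patches = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_patches)]
V = [[random.gauss(0, 0.1) for _ in range(h)] for _ in range(d)]
U = [[random.gauss(0, 0.1) for _ in range(h)] for _ in range(d)]
w = [random.gauss(0, 0.1) for _ in range(h)]

slide_embedding, attention = gated_attention_pool(patches, V, U, w)
```

The per-patch `attention` weights are the quantity that is typically mapped back onto the slide to draw heatmaps, which is why such models can be compared against regions marked by pathologists.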
Results:
All four CLAM models showed encouraging predictive performance, with every AUC exceeding 0.88. The model-generated heatmaps of KRAS, NRAS, and BRAF mutation patterns and HER2 expression levels generally matched the regions identified by pathologists.
Conclusions:
Our method offers a new way to predict gene mutations and protein expression using deep learning. These predictions can act as a prescreening tool, improving cost efficiency before next-generation sequencing (NGS), amplification refractory mutation system-polymerase chain reaction (ARMS-PCR), and IHC are used. This approach ultimately enhances the effectiveness of precision medicine and improves the consistency of physicians' slide evaluations.
Citation