Accepted for/Published in: JMIR Formative Research

Date Submitted: Aug 19, 2025
Date Accepted: Mar 6, 2026

The final, peer-reviewed published version of this preprint can be found here:

Artificial Intelligence Design for Race-Based Prostate Cancer Stage Classification With Multilayer Perceptron: Feature Selection Optimization Approach

Mulia A, Agustriawan D, Overbeek M, Widjaja M, Kurniawan V, Syechlo J, Ahmad MI, Sathipati SY, Kurubanjerdjit N

Artificial Intelligence Design for Race-Based Prostate Cancer Stage Classification With Multilayer Perceptron: Feature Selection Optimization Approach

JMIR Form Res 2026;10:e82587

DOI: 10.2196/82587

PMID: 41989963

Artificial Intelligence Design for Racial-based Prostate Cancer Stage Classification with Multi-Layer Perceptron: Feature Selection Optimization Approach

  • Adithama Mulia
  • David Agustriawan
  • Marlinda Overbeek
  • Moeljono Widjaja
  • Vincent Kurniawan
  • Jheno Syechlo
  • Muhammad Imran Ahmad
  • Srinivasulu Yerukala Sathipati
  • Nilubon Kurubanjerdjit

ABSTRACT

Background:

Prostate cancer progression exhibits significant variability influenced by biological and racial factors. DNA methylation profiling has shown potential in early cancer detection, but its integration with machine learning across racially diverse populations remains limited.

Objective:

This study aims to develop a race-aware framework that uses DNA methylation data and a Multi-Layer Perceptron (MLP) model to classify prostate cancer as early stage (I–II) or late stage (III–IV).

Methods:

Methylation and phenotype data from the TCGA-PRAD dataset were processed using Differentially Methylated Positions (DMP) analysis to identify CpG sites correlated with cancer stages. These features were further refined through Recursive Feature Elimination (RFE) and used to train MLP models. SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) were used to interpret the model and identify key DNA methylation features contributing to model predictions.
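
As a rough illustration of the first screening step described above, the following sketch ranks CpG sites by how strongly their methylation (beta) values differ between early-stage and late-stage samples. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which DMP tool or statistical test was used, so Welch's t-statistic stands in here, and the function names, CpG identifiers, and beta values are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_cpg_sites(beta, labels, top_k):
    """Return the top_k CpG IDs ranked by |t| between early (0) and late (1) samples.

    beta   : dict mapping CpG ID -> list of beta values, one per sample
    labels : list of 0 (stage I-II) or 1 (stage III-IV), aligned with the samples
    """
    scores = {}
    for cpg, values in beta.items():
        early = [v for v, y in zip(values, labels) if y == 0]
        late = [v for v, y in zip(values, labels) if y == 1]
        scores[cpg] = abs(welch_t(early, late))
    # Highest absolute t-statistic first; keep only the top_k candidates.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: three hypothetical CpG sites, six samples (not real TCGA-PRAD values).
labels = [0, 0, 0, 1, 1, 1]
beta = {
    "cg_a": [0.10, 0.12, 0.11, 0.80, 0.82, 0.79],  # strongly stage-associated
    "cg_b": [0.50, 0.52, 0.48, 0.51, 0.49, 0.50],  # no clear difference
    "cg_c": [0.30, 0.35, 0.32, 0.45, 0.40, 0.43],  # moderate difference
}
selected = rank_cpg_sites(beta, labels, top_k=2)
print(selected)
```

In a pipeline like the one described, the sites that survive this kind of screen would then be passed to Recursive Feature Elimination and the MLP. Those later steps are not shown here.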

Results:

The best-performing model achieved approximately 95% accuracy and up to 99% AUC on the majority race (White) training data using 70 selected features. However, performance declined sharply on minority race groups, revealing the effects of sample imbalance and race-specific methylation patterns. Feature importance analysis indicated strong patterns within certain CpG sites driving the model's predictions.
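
A performance gap between cohorts, such as the one reported above, only becomes visible when metrics are computed per group rather than pooled. The sketch below shows one way to do that with accuracy and a rank-based AUC. It is an illustrative assumption, not the study's evaluation code: the group labels, scores, 0.5 decision threshold, and function names are hypothetical, and the toy numbers are not results from the paper.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a late-stage sample outscores an
    early-stage sample, counting ties as 0.5."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when a group contains only one class
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_by_group(groups, y_true, scores, threshold=0.5):
    """Return {group: (n_samples, accuracy, auc)} computed separately per group."""
    by_group = defaultdict(list)
    for g, y, s in zip(groups, y_true, scores):
        by_group[g].append((y, s))
    report = {}
    for g, rows in by_group.items():
        ys = [y for y, _ in rows]
        ss = [s for _, s in rows]
        preds = [int(s >= threshold) for s in ss]
        report[g] = (len(rows), accuracy(ys, preds), roc_auc(ys, ss))
    return report


# Toy predictions (hypothetical; 1 = late stage, 0 = early stage).
groups = ["White"] * 4 + ["Minority"] * 4
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.9, 0.8, 0.6, 0.4, 0.3, 0.7]
report = evaluate_by_group(groups, y_true, scores)
for group, (n, acc, auc) in report.items():
    print(group, n, acc, auc)
```

Reporting the sample count alongside each metric matters here, because small minority cohorts make both accuracy and AUC estimates unstable.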

Conclusions:

We propose a race-aware MLP model for prostate cancer stage classification using DNA methylation data, optimized through DMP and RFE-based feature selection. SHAP and LIME confirmed the predictive relevance of selected CpG sites, supporting model transparency. Results highlight high performance within the White cohort but reveal poor generalization to minority groups, emphasizing the importance of race-specific modeling strategies.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.