Accepted for/Published in: JMIR AI

Date Submitted: Jul 30, 2025
Open Peer Review Period: Aug 1, 2025 - Sep 26, 2025
Date Accepted: Dec 5, 2025
Date Submitted to PubMed: Dec 8, 2025

The final, peer-reviewed published version of this preprint can be found here:

Accelerating Discovery of Leukemia Inhibitors Using AI-Driven Quantitative Structure-Activity Relationship: Algorithm Development and Validation

Kakraba S, Reis RJ

JMIR AI 2026;5:e81552

DOI: 10.2196/81552

PMID: 41358925

PMCID: 12892034

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Accelerating Discovery of Leukemia Inhibitors with AI-Driven QSAR Modeling

  • Samuel Kakraba
  • Robert J.S. Reis

ABSTRACT

Background:

Leukemia treatment remains a major challenge in oncology. Although thiadiazolidinone (TDZD) analogs show potential to inhibit leukemia cell proliferation, they often lack sufficient potency and selectivity. Traditional drug discovery struggles to explore the vast chemical landscape efficiently, highlighting the need for innovative computational strategies. Machine learning (ML)-enhanced quantitative structure-activity relationship (QSAR) modeling offers a promising route to identify and optimize inhibitors with improved activity and specificity.

Objective:

To develop and validate an integrated machine learning–enhanced QSAR modeling workflow for the rational design and prediction of Thiadiazolidinone (TDZD) analogs with improved anti-leukemia activity, by systematically evaluating molecular descriptors and algorithmic approaches to identify key determinants of potency and guide future inhibitor optimization.

Methods:

We analyzed 35 TDZD derivatives with confirmed anti-leukemia activity, removing outliers to ensure data quality. Using Schrödinger MAESTRO, we calculated 220 molecular descriptors (1D–4D). Seventeen ML models, including Random Forests, XGBoost, and Neural Networks, were trained on 70% of the data and tested on the remaining 30%, using stratified sampling. Model performance was assessed with 12 metrics, including mean squared error (MSE), the coefficient of determination (R²), and SHAP (SHapley Additive exPlanations) values, and models were optimized via hyperparameter tuning and 5-fold cross-validation.
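The splitting scheme described in the Methods can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the abstract does not say how stratification was applied to a continuous activity value, so the quantile binning here, the function names `stratified_split` and `k_fold`, the bin count, and the synthetic activity values are all assumptions made for illustration.

```python
import random

def stratified_split(y, test_frac=0.30, n_bins=5, seed=0):
    """Split indices into train/test, stratifying on rank-based bins of y.

    Binning a continuous target is an assumed way to stratify a
    regression problem; the source does not specify the procedure.
    """
    rng = random.Random(seed)
    order = sorted(range(len(y)), key=lambda i: y[i])
    # Assign rank-ordered compounds to n_bins roughly equal-sized bins.
    bins = [[] for _ in range(n_bins)]
    for rank, idx in enumerate(order):
        bins[rank * n_bins // len(y)].append(idx)
    train, test = [], []
    for members in bins:
        rng.shuffle(members)
        n_test = round(len(members) * test_frac)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)

def k_fold(indices, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        tr = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield tr, val

# Synthetic stand-in for 35 activity values (NOT the study's data).
_rng = random.Random(42)
activity = [_rng.uniform(4.0, 8.0) for _ in range(35)]

train_idx, test_idx = stratified_split(activity)
folds = list(k_fold(train_idx, k=5))
```

With 35 compounds and five bins of seven, each bin contributes two test compounds, so the split is 25 training and 10 test compounds (about 29% held out, the closest achievable to 30% under this binning). Cross-validation is run only on the training indices, which keeps the held-out test set untouched during tuning.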

Results:

Ensemble methods, especially LightGBM and Random Forest, showed superior predictive performance (LightGBM: MSE = 0.00063 ± 0.00012; R² = 0.971 ± 0.0084). Isotonic Regression ranked second, outperforming baseline models by over 15% in explained variance. SHAP analysis identified hydrogen bond acceptor count (r_qp_accptHB), electronic properties, and solubility as key features for anti-leukemia activity.
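For readers who want to see how figures of the form "MSE = 0.00063 ± 0.00012" are typically produced, the sketch below computes MSE and R² from their standard definitions and summarizes per-fold scores as mean and sample standard deviation. Reading "±" as a standard deviation across folds is an assumption, since the abstract does not define it, and the numbers used are toy values rather than the study's results.

```python
from statistics import mean, stdev

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def summarize(scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return mean(scores), stdev(scores)

# Toy observed and predicted activities (illustrative only).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
toy_mse = mse(y_true, y_pred)
toy_r2 = r2(y_true, y_pred)

# Hypothetical per-fold R² values, summarized as mean ± SD.
fold_mean, fold_sd = summarize([0.96, 0.97, 0.98])
```

R² measures the fraction of variance in the observed activities that the predictions explain, which is why a comparison such as "over 15% in explained variance" is naturally expressed as a difference in R² between models.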

Conclusions:

Integrating ML with QSAR modeling aids the refinement of leukemia inhibitors, improving prediction accuracy while revealing underlying mechanisms. This approach accelerates the identification of potent compounds and offers a pathway to overcoming therapeutic resistance in leukemia.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.