JMIR Preprints #81552: Accelerating Discovery of Leukemia Inhibitors with AI-Driven QSAR Modeling


Accelerating Discovery of Leukemia Inhibitors with AI-Driven QSAR Modeling

Samuel Kakraba; Robert J.S. Reis

ABSTRACT

Background:

Leukemia treatment remains a major challenge in oncology. Thiadiazolidinone (TDZD) analogs show potential to inhibit leukemia cell proliferation, but they often lack sufficient potency and selectivity. Traditional drug discovery struggles to explore the vast chemical space efficiently, which highlights the need for innovative computational strategies. Quantitative structure-activity relationship (QSAR) modeling enhanced with machine learning (ML) offers a promising route to identify and optimize inhibitors with improved activity and specificity.

Objective:

To develop and validate an integrated ML-enhanced QSAR modeling workflow for the rational design and prediction of TDZD analogs with improved anti-leukemia activity, by systematically evaluating molecular descriptors and algorithmic approaches to identify key determinants of potency and guide future inhibitor optimization.

Methods:

We analyzed 35 TDZD derivatives with confirmed anti-leukemia activity, after removing outliers to improve data quality. Using Schrödinger MAESTRO, we calculated 220 molecular descriptors (1D–4D). Seventeen ML models, including Random Forests, XGBoost, and Neural Networks, were trained on 70% of the data and tested on the remaining 30%, using stratified sampling. Model performance was assessed with 12 metrics, including mean squared error (MSE), R², and SHAP (Shapley additive explanations) values; models were optimized via hyperparameter tuning and 5-fold cross-validation.
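The data partitioning described above can be illustrated with a minimal sketch. It is not the authors' code: the activity values are random placeholders, the helper names `stratified_split` and `k_fold` are hypothetical, and stratifying a continuous activity by quantile bins is an assumption, since the abstract does not say how the strata were defined.

```python
import random

def stratified_split(y, test_frac=0.3, n_bins=5, seed=0):
    """Split indices into train/test sets, stratified on activity quantile bins.

    Assumption: strata are equal-sized bins of the activity-sorted compounds.
    """
    rng = random.Random(seed)
    order = sorted(range(len(y)), key=lambda i: y[i])
    n = len(y)
    # Cut the activity-sorted indices into n_bins roughly equal bins.
    bins = [order[b * n // n_bins:(b + 1) * n // n_bins] for b in range(n_bins)]
    train, test = [], []
    for members in bins:
        members = members[:]
        rng.shuffle(members)
        n_test = round(test_frac * len(members))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)

def k_fold(indices, k=5, seed=0):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = indices[:]
    rng.shuffle(shuffled)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        validate = folds[f]
        fit = [i for g in range(k) if g != f for i in folds[g]]
        yield fit, validate

# Placeholder activities for 35 compounds (NOT the study's data).
rng = random.Random(42)
activity = [rng.uniform(4.0, 8.0) for _ in range(35)]

train_idx, test_idx = stratified_split(activity)
folds = list(k_fold(train_idx))
print(len(train_idx), len(test_idx), len(folds))
```

With 35 compounds the split cannot be exactly 70/30; this sketch holds out 2 compounds from each of 5 bins, so cross-validation for hyperparameter tuning runs only on the training indices and the held-out compounds stay untouched until the final test.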

Results:

Ensemble methods, especially LightGBM and Random Forest, showed superior predictive performance (LightGBM: MSE = 0.00063 ± 0.00012; R² = 0.971 ± 0.0084). Isotonic Regression ranked second, outperforming baseline models by over 15% in explained variance. SHAP analysis identified hydrogen bond acceptor count (r_qp_accptHB), electronic properties, and solubility as key features for anti-leukemia activity.
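For readers unfamiliar with the two headline metrics, the following sketch shows how MSE and R² are conventionally computed from observed and predicted activities. The numbers are made-up placeholders rather than values from the study, and the function names `mse` and `r_squared` are illustrative.

```python
def mse(y_true, y_pred):
    """Mean squared error: average squared difference between observed and predicted."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Placeholder observed and predicted activities (NOT the study's data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

print(mse(y_true, y_pred))        # approximately 0.025
print(r_squared(y_true, y_pred))  # approximately 0.98
```

Because MSE is expressed in squared units of the activity scale, a very small MSE is only meaningful relative to the spread of the measured activities, which is why it is usually reported together with R².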

Conclusions:

Integrating ML with QSAR modeling supports the refinement of leukemia inhibitors, improving prediction accuracy while revealing underlying mechanisms. This approach accelerates the identification of potent compounds and offers a pathway to overcome therapeutic resistance in leukemia.

Citation

Please cite as:

Kakraba S, Reis RJ

Accelerating Discovery of Leukemia Inhibitors Using AI-Driven Quantitative Structure-Activity Relationship: Algorithm Development and Validation

JMIR AI 2026;5:e81552

DOI: 10.2196/81552

PMID: 41358925

PMCID: 12892034


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Jul 30, 2025

Open Peer Review Period: Aug 1, 2025 - Sep 26, 2025

Date Accepted: Dec 5, 2025

Date Submitted to PubMed: Dec 8, 2025

