Accepted for/Published in: JMIR AI

Date Submitted: Jul 28, 2024
Date Accepted: Mar 26, 2025

The final, peer-reviewed published version of this preprint can be found here:

Clinical Laboratory Parameter–Driven Machine Learning for Participant Selection in Bioequivalence Studies Among Patients With Gastric Cancer: Framework Development and Validation Study

Seong SJ, Shon B, Choi EJ, Gwon MR, Lee HW, Park J, Chung HY, Jeong S, Yoon YR

Clinical Laboratory Parameter–Driven Machine Learning for Participant Selection in Bioequivalence Studies Among Patients With Gastric Cancer: Framework Development and Validation Study

JMIR AI 2025;4:e64845

DOI: 10.2196/64845

PMID: 40605831

PMCID: 12223687

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

A Machine Learning-based Framework Using Clinical Laboratory Parameters to Support Participant Enrollment in Clinical Trials

  • Sook Jin Seong
  • Byungeun Shon
  • Eun Jung Choi
  • Mi-Ri Gwon
  • Hae Won Lee
  • Jaechan Park
  • Ho-Young Chung
  • Sungmoon Jeong
  • Young-Ran Yoon

ABSTRACT

Background:

Insufficient participant enrollment is a major factor responsible for clinical trial failure.

Objective:

We formulated a machine learning (ML)-based framework using clinical laboratory parameters to identify participants eligible for enrollment in clinical trials.

Methods:

We acquired the records of 11,592 patients with gastric cancer from the electronic medical records of Kyungpook National University Hospital in Korea, covering 2011 to 2019. Eight clinical laboratory parameters (hemoglobin, neutrophil count, platelet count, total bilirubin, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, and creatinine) and their acquisition dates were used for ML model development. The dataset was divided by collection period: data from 2011 to 2018 formed the training set used to design the ML-based candidate selection method, and data from 2019 formed the test set used to evaluate its performance. The generalization performance of the ML-based method was assessed using the F1 score and the area under the receiver operating characteristic curve (AUC). The proposed model was also compared with random selection to evaluate its efficacy in recruiting participants.
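The split and the two metrics described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the toy values, and the function names (`temporal_split`, `f1_score`, `roc_auc`) are assumptions made for illustration, and the abstract does not state whether the split was applied per record or per patient.

```python
from datetime import date

# Hypothetical toy records: (patient_id, acquisition_date, label).
# The real feature set and labels are not specified in the abstract.
records = [
    ("p1", date(2012, 3, 1), 1),
    ("p2", date(2018, 12, 31), 0),
    ("p3", date(2019, 1, 1), 1),
    ("p4", date(2019, 6, 15), 0),
]


def temporal_split(rows, test_year=2019):
    """Split rows by acquisition year: earlier years train, test_year tests."""
    train = [r for r in rows if r[1].year < test_year]
    test = [r for r in rows if r[1].year == test_year]
    return train, test


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


train_rows, test_rows = temporal_split(records)
```

Splitting by calendar period rather than at random keeps all test data later than all training data, which mimics how such a model would be applied to newly arriving patients.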

Results:

The area under the receiver operating characteristic curve for each clinical parameter ranged from 0.789 to 0.915, indicating good performance on the test data. Using the ML model, we ranked patients in descending order of predicted probability and identified valid candidates from the top of the list. The proposed ML model identified valid clinical trial participants faster than random selection, with a maximum workload reduction of 57%.
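The comparison between probability-ranked and random selection can be sketched as follows. This is an illustrative assumption, not the study's procedure: the abstract does not define how workload reduction was computed, so the sketch takes it to be the relative decrease in the number of patients screened before a target number of valid candidates is found. All names and toy values are hypothetical.

```python
import random


def screened_until(order, is_valid, target):
    """Number of patients screened, in the given order, until `target`
    valid candidates are found. Returns None if the target is never reached."""
    found = 0
    for n, pid in enumerate(order, start=1):
        if is_valid[pid]:
            found += 1
            if found == target:
                return n
    return None


def ml_order(pred_prob):
    """Patient ids sorted by predicted probability, highest first."""
    return sorted(pred_prob, key=pred_prob.get, reverse=True)


def mean_random_screened(ids, is_valid, target, trials=1000, seed=0):
    """Average screening count over repeated random orderings."""
    if sum(is_valid[i] for i in ids) < target:
        raise ValueError("target exceeds the number of valid candidates")
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        order = list(ids)
        rng.shuffle(order)
        total += screened_until(order, is_valid, target)
    return total / trials


def workload_reduction(ml_n, random_n):
    """Assumed definition: relative decrease in patients screened."""
    return 1 - ml_n / random_n


# Hypothetical predicted probabilities and ground-truth eligibility.
pred_prob = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.1, "e": 0.7, "f": 0.3}
is_valid = {"a": True, "b": False, "c": True, "d": False, "e": False, "f": True}

ml_n = screened_until(ml_order(pred_prob), is_valid, target=2)
random_n = mean_random_screened(list(pred_prob), is_valid, target=2)
reduction = workload_reduction(ml_n, random_n)
```

In this toy case the two highest-ranked patients are both valid, so the ranked approach stops after two screenings, whereas random ordering needs more on average.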

Conclusions:

The proposed ML-based framework using clinical laboratory parameters can be used to identify patients eligible for a clinical trial, enabling faster participant enrollment.

Clinical Trial:

KNUH 2020-04-023




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.