Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 24, 2024
Open Peer Review Period: Sep 24, 2024 - Oct 9, 2024
Date Accepted: Jan 31, 2025

The final, peer-reviewed published version of this preprint can be found here:

Augmenting Insufficiently Accruing Oncology Clinical Trials Using Generative Models: Validation Study

El Kababji S, Mitsakakis N, Jonker E, Beltran-Bless AA, Pond G, Vandermeer L, Radhakrishnan D, Mosquera L, Paterson A, Shepherd L, Chen B, Barlow W, Gralow J, Savard MF, Fesl C, Hlauschek D, Balic M, Rinnerthaler G, Greil R, Gnant M, Clemons M, El Emam K

J Med Internet Res 2025;27:e66821

DOI: 10.2196/66821

PMID: 40053790

PMCID: 11923467

Augmenting Insufficiently Accruing Oncology Clinical Trials Using Generative Models: A Validation Study

  • Samer El Kababji
  • Nicholas Mitsakakis
  • Elizabeth Jonker
  • Ana-Alicia Beltran-Bless
  • Greg Pond
  • Lisa Vandermeer
  • Dhenuka Radhakrishnan
  • Lucy Mosquera
  • Alexander Paterson
  • Lois Shepherd
  • Bingshu Chen
  • William Barlow
  • Julie Gralow
  • Marie-France Savard
  • Christian Fesl
  • Dominik Hlauschek
  • Marija Balic
  • Gabriel Rinnerthaler
  • Richard Greil
  • Michael Gnant
  • Mark Clemons
  • Khaled El Emam

ABSTRACT

Background:

Insufficient accrual is a major challenge in clinical trials. It can result in underpowered studies and expose study participants to toxicity and additional costs with limited scientific benefit.

Objective:

To evaluate whether generative models can be used to simulate additional virtual patients to compensate for insufficient accrual in clinical trials.

Methods:

We performed a retrospective analysis using ten datasets from nine fully accrued, completed, and published breast cancer trials. For each trial, we removed the latest recruited patients, trained a generative model on the remaining patients, and used that model to simulate virtual patients to replace the removed ones, thereby augmenting the available data. We then replicated the published analysis on this augmented dataset to determine whether the findings were the same. Four generative models were evaluated: sequential synthesis with decision trees, a Bayesian network, a generative adversarial network, and a variational autoencoder. These generative models were compared with sampling with replacement (bootstrap) as a simple alternative. Replication of the published analysis was assessed using four metrics: decision agreement, estimate agreement, standardized difference, and confidence interval overlap.
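The truncate-and-augment procedure can be sketched for the simplest comparator named above, sampling with replacement. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `augment_with_bootstrap`, the record layout, and the fixed seed are hypothetical, and patients are assumed to be ordered by recruitment date with the earliest first.

```python
import random


def augment_with_bootstrap(patients, removed_fraction, seed=0):
    """Drop the latest-recruited patients and replace them by resampling.

    `patients` is assumed to be a list of records ordered by recruitment
    (earliest first). The name and signature are hypothetical.
    """
    if not patients:
        raise ValueError("patients must not be empty")
    if not 0 <= removed_fraction < 1:
        raise ValueError("removed_fraction must be in [0, 1)")

    n_total = len(patients)
    # Keep at least one retained patient so there is something to sample from.
    n_removed = min(round(n_total * removed_fraction), n_total - 1)

    # Simulate insufficient accrual: keep only the earliest-recruited patients.
    retained = patients[: n_total - n_removed]

    # Bootstrap baseline: draw replacements from the retained patients,
    # with replacement, to restore the original sample size.
    rng = random.Random(seed)
    simulated = rng.choices(retained, k=n_removed)

    return retained, simulated, retained + simulated


# Usage: ten toy patients, remove the last 40% and refill by resampling.
toy_trial = [{"id": i, "arm": i % 2} for i in range(10)]
retained, simulated, augmented = augment_with_bootstrap(toy_trial, 0.4)
print(len(retained), len(simulated), len(augmented))
```

A generative model would take the place of the `rng.choices` call: it would be fitted on `retained` and then sampled to produce the simulated records, after which the published analysis is rerun on the augmented dataset.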

Results:

All approaches struggled when the trial result was marginal, indicating a general sensitivity to that particular scenario. Otherwise, sequential synthesis performed well on the replication metrics for the removal of up to 40% of the last recruited patients (decision agreement: 88% to 100% across datasets; estimate agreement: 100%; failure to reject the standardized difference null hypothesis: 89% to 100%; and CI overlap: 0.8 to 0.92), and the Bayesian network performed relatively well on the smallest datasets. There was no evidence of a monotonic relationship between the estimated effect size and recruitment order across these studies. This suggests that patients recruited earlier in a trial are not systematically different from those recruited later, which at least partially explains why generative models trained on early data can effectively simulate patients recruited later in a trial.
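Two of the replication metrics can be illustrated with a short sketch. The abstract does not define them, so the formulas here are assumptions: CI overlap is computed with a definition commonly used in the synthetic data literature (the shared interval length averaged over each interval's own length), and decision agreement is taken to mean that both confidence intervals lead to the same conclusion relative to a null value. The function names and the default null value of 1.0 (as for a hazard ratio) are hypothetical.

```python
def ci_overlap(ci_original, ci_augmented):
    """Average proportion of each interval covered by the shared range.

    Assumed definition: 1.0 for identical intervals, negative when the
    intervals are disjoint.
    """
    lo_o, hi_o = ci_original
    lo_a, hi_a = ci_augmented
    shared = min(hi_o, hi_a) - max(lo_o, lo_a)
    return 0.5 * (shared / (hi_o - lo_o) + shared / (hi_a - lo_a))


def decision_agreement(ci_original, ci_augmented, null_value=1.0):
    """True when both intervals imply the same decision about the null.

    Assumed definition: each interval is classified as entirely below,
    entirely above, or containing `null_value`.
    """
    def decision(ci):
        lo, hi = ci
        if hi < null_value:
            return -1  # significant, effect below the null
        if lo > null_value:
            return 1   # significant, effect above the null
        return 0       # not significant

    return decision(ci_original) == decision(ci_augmented)


# Usage with made-up hazard ratio intervals (not data from the study).
print(ci_overlap((0.0, 2.0), (1.0, 3.0)))             # 0.5
print(decision_agreement((0.6, 0.9), (0.7, 0.95)))    # True
print(decision_agreement((0.6, 0.9), (0.7, 1.1)))     # False
```

The last call shows why marginal results are difficult under this assumed definition: when the original interval barely excludes the null, a small shift in the augmented estimate is enough to flip the decision.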

Conclusions:

For a study with poor accrual, sequential synthesis is relatively effective and can enable the simulation of the full dataset had the study continued accruing patients. For the smaller datasets, a Bayesian network should be used. These results demonstrate the potential for generative models to rescue poorly accruing clinical trials.

Clinical Trial:

NCT02861859; NCT02721433; NCT00066573; NCT00009945; NCT02428114; NCT02816164; NCT02632435; NCT00295646; NCT03664687; NCT00127205


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.