Currently submitted to: JMIR Bioinformatics and Biotechnology
Date Submitted: Jun 15, 2026
Open Peer Review Period: Jun 18, 2026 - Aug 13, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Selection-bias-aware multimodal survival learning for molecular pathological epidemiology: A stabilized IPW–Cox method with a differentiable etiologic-heterogeneity head
ABSTRACT
Deep survival models for tumor-tissue and molecular studies are trained only on patients whose specimens are collected, archived, retrieved, assayed, and pass quality control. This is a covariate-dependent draw from the incident-case population, and it biases the learned relationship toward the analyzable sample rather than the population. We present SBASURV, an architecture-agnostic survival head that replaces the Cox partial likelihood with a stabilized inverse-probability-weighted Cox objective, weighting both the event term and the risk-set denominator, and implements the Lunn–McNeil duplication method as a differentiable module emitting a Wald statistic for etiologic heterogeneity. On five public datasets with induced covariate-dependent selection, the head improves held-out concordance over a naive neural head, with weight diagnostics making the bias–variance regime explicit. Deployed on real colorectal cancer data across clinical, omics, and whole-slide histopathology modalities with real availability weights, clinical+omics fusion reaches a held-out Harrell C of 0.68. Availability weighting preserves discrimination while shifting the estimand, and a heterogeneity test that is significant under a mutation-burden proxy subtype is null under a genuine methylation-defined subtype, a caution that subtype conclusions depend on how subtypes are operationalized.
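As a rough illustration of the objective described in the abstract, the following is a minimal NumPy sketch of a stabilized inverse-probability-weighted Cox negative log partial likelihood in which the weights enter both the event term and the risk-set denominator. It is not the authors' implementation: the function names `stabilized_weights` and `weighted_cox_nll`, the Breslow-style handling of ties, the clipping threshold, and the normalization by the weighted event count are all assumptions made for this example.

```python
import numpy as np

def stabilized_weights(p_selected, marginal_rate):
    """Stabilized IPW (assumed form): marginal selection rate divided by
    each subject's estimated probability of being in the analyzable sample."""
    p = np.clip(np.asarray(p_selected, dtype=float), 1e-6, 1.0)  # avoid division by ~0
    return marginal_rate / p

def weighted_cox_nll(risk_score, time, event, weight):
    """Negative weighted Cox log partial likelihood (Breslow-style ties).

    risk_score: model output eta_i (log relative hazard)
    time:       follow-up time
    event:      1 if the event was observed, 0 if censored
    weight:     selection weight w_i
    """
    risk_score = np.asarray(risk_score, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    weight = np.asarray(weight, dtype=float)

    # at_risk[i, j] is True when subject j is still at risk at subject i's time
    at_risk = time[None, :] >= time[:, None]

    # Weighted risk-set denominator: sum over at-risk j of w_j * exp(eta_j),
    # computed with a max-shift for numerical stability
    shift = risk_score.max()
    weighted_exp = weight * np.exp(risk_score - shift)
    log_denom = np.log((at_risk * weighted_exp[None, :]).sum(axis=1)) + shift

    # Weighted event term: only subjects with an observed event contribute
    log_lik = np.sum(weight * event * (risk_score - log_denom))

    # Normalize by the weighted number of events (an assumption of this sketch)
    return -log_lik / max(np.sum(weight * event), 1e-12)
```

With unit weights this reduces to the ordinary Cox partial likelihood, which gives a simple sanity check. In a deep-learning setting the same expression would be written in an autodiff framework so that gradients flow through `risk_score`.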
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.