Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
From Data-Driven Nomograms to Knowledge-Driven Clinical AI: Comparative Validation of Bayesian and LLM-Based Models in Preoperative Risk Prediction
ABSTRACT
Background:
Clinical prediction tools are commonly derived from retrospective datasets using statistical learning approaches. Although effective, these data-driven nomograms may be population-specific, static, and only partially aligned with causal clinical reasoning. Recent advances in artificial intelligence enable knowledge-driven approaches in which expert assumptions, probabilistic structures, and domain reasoning contribute directly to model construction.
Objective:
To compare the performance of two knowledge-driven AI models with established data-driven nomograms for preoperative prediction of lymph node invasion (LNI) in localized prostate cancer.
Methods:
A retrospective dataset of 229 consecutive patients with clinically localized prostate cancer (cT2 on examination and MRI) treated with radical prostatectomy and extended pelvic lymph node dissection served as the use case. Histopathological LNI was the reference standard. Predictors included PSA density, MRI extracapsular extension, biopsy ISUP grade group, and maximum tumor diameter on MRI. Three conventional models (Briganti, Yale, Roach) were compared with two knowledge-driven systems: (1) an LLM-assisted logistic equation generated from predefined clinical constraints, and (2) a Bayesian network parameterized through structured expert/AI elicitation. Discrimination, threshold metrics, predictive values, and decision utility were assessed.
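For orientation, a logistic equation over the four predictors named in the Methods generally takes the form sketched below. This is a generic illustration, not the authors' equation: the coefficients \beta_0 to \beta_4 are unspecified placeholders, the symbol names (PSAD, ECE_MRI, ISUP, D_max) are hypothetical, and the coding of each predictor (continuous, binary, or ordinal) is an assumption.

```latex
\[
P(\mathrm{LNI}) =
  \frac{1}{1 + \exp\!\bigl(-(\beta_0
    + \beta_1\,\mathrm{PSAD}
    + \beta_2\,\mathrm{ECE}_{\mathrm{MRI}}
    + \beta_3\,\mathrm{ISUP}
    + \beta_4\,D_{\max})\bigr)}
\]
```

In a knowledge-driven setting, the signs and relative magnitudes of such coefficients would presumably come from the predefined clinical constraints rather than from fitting to the cohort.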
Results:
The LLM-assisted logistic model (AUC 0.697) and the Bayesian network (AUC 0.689) performed close to the Briganti model, which achieved the highest discrimination (AUC 0.721). The Bayesian network achieved the highest Youden index (0.346), indicating the best balance between sensitivity and specificity, and showed strong clinical utility. Negative predictive values exceeded 0.89 for all models.
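The AUC, Youden index, and negative predictive value reported above can be computed from predicted probabilities and observed outcomes along the lines of the minimal sketch below. The function names and the toy data are hypothetical and illustrative only. They are not the study's code or data, and the authors' exact procedure (for example, how thresholds were selected) is not stated in the abstract.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one
    (Mann-Whitney formulation; ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(y_true: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_best(y_true: Sequence[int],
                scores: Sequence[float]) -> Tuple[float, float]:
    """Scan every observed score as a candidate threshold and return
    (best J, threshold), where J = sensitivity + specificity - 1."""
    best_j, best_t = -1.0, float("nan")
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(y_true, scores, t)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # keep the lowest threshold among ties
            best_j, best_t = j, t
    return best_j, best_t


def npv(y_true: Sequence[int], scores: Sequence[float],
        threshold: float) -> float:
    """Negative predictive value: tn / (tn + fn) at the given threshold."""
    _, _, tn, fn = confusion(y_true, scores, threshold)
    return tn / (tn + fn) if (tn + fn) else float("nan")


# Toy example (made-up values, NOT study data): 1 = LNI present, 0 = absent.
toy_outcomes: List[int] = [0, 0, 0, 0, 1, 0, 1, 1]
toy_probs: List[float] = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]

toy_auc = auc(toy_outcomes, toy_probs)
toy_j, toy_threshold = youden_best(toy_outcomes, toy_probs)
toy_npv = npv(toy_outcomes, toy_probs, toy_threshold)
```

Because LNI is uncommon in a cT2 cohort, a high negative predictive value is partly a consequence of low prevalence, which is why it is typically read alongside threshold-independent measures such as AUC.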
Conclusions:
Knowledge-driven AI models achieved performance comparable to established nomograms while offering interpretability and probabilistic reasoning. These findings support prospective evaluation of hybrid data- and knowledge-driven clinical decision-support systems.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.