Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jun 12, 2025
Open Peer Review Period: Jun 12, 2025 - Aug 7, 2025
Date Accepted: Oct 23, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Enabling Just-in-Time Clinical Oncology Analysis: Large Language Models as Direct Analytical Engines for Unstructured Data
ABSTRACT
Background:
Traditional cancer registries, limited by labor-intensive manual data abstraction and rigid, predefined schemas, often hinder timely and comprehensive oncology research. While Large Language Models (LLMs) have shown promise in automating data extraction, their potential to perform direct, just-in-time (JIT) analysis on unstructured clinical narratives – potentially bypassing intermediate structured databases for many analytical tasks – remains largely unexplored.
Objective:
This study aimed to evaluate whether a state-of-the-art LLM (Gemini 2.5 Pro) can enable a JIT clinical oncology analysis paradigm by: 1) performing high-fidelity multiparameter data extraction, 2) answering complex clinical queries directly from raw text, 3) automating multi-step survival analyses including executable code generation, and 4) generating novel, clinically plausible hypotheses from free-text documentation.
Methods:
A synthetic dataset of 240 unstructured medical reports from stage IV non-small cell lung cancer (NSCLC) patients, embedding 14 predefined clinical variables, was used. Gemini 2.5 Pro was assessed on the four core JIT capabilities. Performance was measured by: extraction accuracy (compared to human annotation on n=40 reports and across the full n=240 dataset), numerical deviation for direct question answering (n=40 to n=240 reports, 5 questions), log-rank concordance for LLM-generated vs. ground-truth Kaplan-Meier survival analyses (OS and PFS from n=80 and n=160 reports), and clinical plausibility of LLM-generated hypotheses from the full dataset (n=240 reports).
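The first two performance measures can be illustrated with a minimal Python sketch. It assumes that extraction accuracy is the share of exactly matching variable values and that numerical deviation is the absolute difference relative to the ground-truth value; the abstract does not spell out either formula. The function names, the variable names (ecog, pdl1_percent), and the toy values are hypothetical and are not taken from the study.

```python
def extraction_accuracy(predicted, reference):
    """Share of (report, variable) cells where the extracted value equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must cover the same reports")
    total = 0
    matches = 0
    for pred_row, ref_row in zip(predicted, reference):
        for variable, ref_value in ref_row.items():
            total += 1
            if pred_row.get(variable) == ref_value:
                matches += 1
    if total == 0:
        raise ValueError("no reference values to score")
    return matches / total


def relative_deviation(llm_answer, ground_truth):
    """Absolute difference between the two numbers, relative to the ground truth."""
    if ground_truth == 0:
        raise ValueError("relative deviation is undefined for a zero ground truth")
    return abs(llm_answer - ground_truth) / abs(ground_truth)


# Hypothetical toy data: two reports, two variables each (not from the study).
reference = [
    {"ecog": 1, "pdl1_percent": 50},
    {"ecog": 0, "pdl1_percent": 10},
]
predicted = [
    {"ecog": 1, "pdl1_percent": 50},
    {"ecog": 0, "pdl1_percent": 15},  # one mismatching value
]

accuracy = extraction_accuracy(predicted, reference)
deviation = relative_deviation(99.0, 100.0)
```

Under these assumptions, one mismatch in four values gives an accuracy of 0.75, and an answer of 99 against a ground truth of 100 gives a relative deviation of 1%.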
Results:
For multiparameter extraction from n=40 reports, the LLM achieved >99% average accuracy, comparable to a human annotator (Friedman test, p=0.139), but in significantly less time (LLM: 3.7 minutes vs. Human: 133.8 minutes). Across the full 240-report dataset, LLM multiparameter extraction maintained >98% accuracy for most variables. The LLM answered multi-conditional clinical queries directly from raw text with a relative deviation typically below 1% and rarely exceeding 1.5%, even with up to 240 reports. Crucially, it autonomously performed end-to-end survival analysis, generating text-to-R-code that produced Kaplan-Meier curves statistically indistinguishable from ground truth for OS (log-rank p=0.99) and PFS (log-rank p=0.89). Subgroup PFS analysis (driver mutation vs. wild type, n=160) was also accurately replicated (log-rank p < 0.0001), with comparable median PFS (e.g., Driver: LLM 26.0 vs. Ground Truth 28.0 months). Furthermore, the LLM generated clinically plausible hypotheses regarding biomarker–outcome associations and toxicities without specific prompting.
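To make the survival comparison concrete, the following is a minimal, dependency-free Python sketch of a Kaplan-Meier estimator and a two-group log-rank test. It is not the R code generated in the study; the function names and the toy follow-up times are hypothetical. A log-rank p-value close to 1 means the test detects no difference between the two curves, which is how the reported OS and PFS comparisons against ground truth should be read.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time (event=1, censored=0)."""
    curve = []
    survival = 1.0
    for t in sorted({x for x, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def logrank_p_value(times_a, events_a, times_b, events_b):
    """Two-sided log-rank p-value (chi-square approximation, 1 degree of freedom)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in sorted({x for x, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        n = n_a + n_b
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        d = d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up times in months; event 0 marks a censored patient.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
median = median_survival(curve)
p_same = logrank_p_value([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
```

In this toy example, two groups with identical follow-up data give a log-rank p-value of 1, the limiting case of the near-1 p-values reported for the OS and PFS comparisons.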
Conclusions:
LLMs can enable a paradigm shift towards dynamic, just-in-time clinical analysis and knowledge discovery directly from narrative data, offering a powerful alternative or complement to traditional registry architectures for many research and analytical needs. This suggests a future of AI-assisted, “living” oncology ecosystems capable of supporting timely, scalable, and hypothesis-driven research. Rigorous validation on real-world, multi-institutional datasets, with careful attention to ethics and data privacy, is essential before clinical implementation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.