Currently submitted to: JMIR Formative Research
Date Submitted: May 8, 2026
Open Peer Review Period: May 22, 2026 - Jul 17, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Graph-Constrained Skill Loading for Domain-Adapted Agentic AI in Clinical Trial Data Pipelines: A Controlled Evaluation
ABSTRACT
Background:
Clinical trial data pipelines require strict adherence to Clinical Data Interchange Standards Consortium (CDISC) regulatory standards, including the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM). General-purpose large language models (LLMs) frequently violate these requirements through terminology hallucinations, noncompliant variable mappings, and fabricated derivation logic. Standard domain adaptation techniques, including retrieval-augmented generation (RAG) and fine-tuning, do not encode the structured workflow dependencies inherent in layered regulatory pipelines. No prior work has systematically evaluated AI-assisted regulatory statistical programming with domain-specific knowledge injection constrained by workflow topology.
Objective:
This study aimed to evaluate whether graph-constrained skill loading—a mechanism that injects domain-specific regulatory knowledge into an agentic AI system based on workflow position within a directed acyclic graph (DAG)—can improve the regulatory compliance quality of AI-generated clinical data artifacts compared with an unconstrained baseline.
Methods:
The clinical trial evidence pipeline was modeled as a 45-node DAG spanning 7 procedural layers. An Adaptive Priority Scheduler selectively loaded domain-specific skills (59 SKILL.md rule files containing CDISC dictionaries, derivation logic, and validation protocols) based on graph proximity within a fixed token budget. We conducted a controlled evaluation comparing 3 conditions (unbounded baseline, graph-constrained framework, and framework with distilled principles) across 10 regulatory tasks (n=30 runs per condition, N=90 total). Performance was assessed using automated code-density metrics and a blinded LLM-as-a-judge evaluation protocol with 2 independent panels.
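To make the loading mechanism concrete, the following is a minimal sketch of proximity-ranked skill selection under a token budget. It is not the authors' Adaptive Priority Scheduler: the node names, skill names, token costs, the use of undirected hop distance as "graph proximity," and the greedy fill rule are all assumptions chosen for illustration.

```python
from collections import deque


def graph_distances(edges, start):
    """Breadth-first hop distance from `start`.

    Assumption: proximity ignores edge direction, so upstream and
    downstream neighbors of the current node count equally.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def select_skills(edges, skills, current_node, token_budget):
    """Greedily load the nearest skills that still fit in the budget.

    `skills` is a list of (name, attached_node, token_cost) tuples.
    Ties in distance are broken by lower token cost, then by name.
    """
    distances = graph_distances(edges, current_node)
    reachable = [s for s in skills if s[1] in distances]
    ranked = sorted(reachable, key=lambda s: (distances[s[1]], s[2], s[0]))
    loaded, used = [], 0
    for name, _node, cost in ranked:
        if used + cost <= token_budget:
            loaded.append(name)
            used += cost
    return loaded, used


# Hypothetical 4-node slice of a pipeline DAG (not the study's 45-node graph).
edges = [
    ("raw", "sdtm_dm"),
    ("sdtm_dm", "adam_adsl"),
    ("adam_adsl", "tlf_demog"),
]
# Hypothetical skills with made-up token costs.
skills = [
    ("sdtm_dm_rules", "sdtm_dm", 400),
    ("adsl_derivations", "adam_adsl", 500),
    ("tlf_layout", "tlf_demog", 300),
    ("raw_ingest", "raw", 200),
]

loaded, used = select_skills(edges, skills, "adam_adsl", token_budget=900)
print(loaded, used)
```

In this toy configuration the skill attached to the current node is loaded first, followed by the cheapest adjacent skill; the remaining candidates are skipped because they would exceed the budget. A real scheduler could weight direction, layer, or task type differently.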
Results:
Graph-constrained skill loading significantly improved overall output quality (+0.47 on a 5-point scale; 95% confidence interval [CI] 0.12–0.80; P=.004), regulatory structure compliance (+0.63; P<.001), and terminology precision (+0.50; P=.01). The LLM-based compliance evaluation corroborated these findings, with a compliance margin of +0.64 (framework: 4.34 vs baseline: 3.70) and substantial interpanel agreement (Cohen κ=0.76; Pearson r=0.95). The constraint mechanism incurred a modest 12% increase in token cost. An exploratory condition adding distilled global principles showed slight performance degradation (−0.13), suggesting attention saturation from redundant constraints.
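For readers unfamiliar with the interpanel agreement statistic, the sketch below computes an unweighted Cohen κ for two raters on a 5-point scale. The ratings are toy values invented for illustration and are not the study's data; the abstract does not state whether a weighted or unweighted κ was used, so the unweighted form here is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items with identical scores.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal proportions.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(
        counts_a[label] * counts_b[label] for label in counts_a
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Toy scores from two hypothetical judge panels (NOT the study's ratings).
panel_a = [4, 4, 3, 5, 4, 3, 5, 4]
panel_b = [4, 4, 3, 5, 3, 3, 5, 5]

kappa = cohen_kappa(panel_a, panel_b)
print(round(kappa, 3))
```

Because κ subtracts agreement expected by chance, it is lower than the raw proportion of matching scores whenever the raters' score distributions overlap.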
Conclusions:
Graph-constrained skill loading produces statistically significant improvements in regulatory compliance quality for clinical trial data generation, with favorable cost-efficiency ratios compared with general-purpose agentic approaches. The consistent improvements across multiple assessment tiers and the identification of boundary limitations (eg, metadata synthesis tasks) provide a foundation for future validation with real-world clinical trial data and practicing domain experts. All code and evaluation materials are publicly available.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.