Currently submitted to: JMIR Formative Research

Date Submitted: May 8, 2026
Open Peer Review Period: May 22, 2026 - Jul 17, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Graph-Constrained Skill Loading for Domain-Adapted Agentic AI in Clinical Trial Data Pipelines: A Controlled Evaluation

  • Jaime Yan

ABSTRACT

Background:

Clinical trial data pipelines require strict adherence to Clinical Data Interchange Standards Consortium (CDISC) regulatory standards, including the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM). General-purpose large language models (LLMs) frequently violate these requirements through terminology hallucinations, noncompliant variable mappings, and fabricated derivation logic. Standard domain adaptation techniques, including retrieval-augmented generation (RAG) and fine-tuning, do not encode the structured workflow dependencies inherent in layered regulatory pipelines. No prior work has systematically evaluated AI-assisted regulatory statistical programming with domain-specific knowledge injection constrained by workflow topology.

Objective:

This study aimed to evaluate whether graph-constrained skill loading—a mechanism that injects domain-specific regulatory knowledge into an agentic AI system based on workflow position within a directed acyclic graph (DAG)—can improve the regulatory compliance quality of AI-generated clinical data artifacts compared with an unconstrained baseline.

Methods:

The clinical trial evidence pipeline was modeled as a 45-node DAG spanning 7 procedural layers. An Adaptive Priority Scheduler selectively loads domain-specific skills (59 SKILL.md rule files containing CDISC dictionaries, derivation logic, and validation protocols) based on graph proximity within a fixed token budget. We conducted a controlled evaluation comparing 3 conditions—unbounded baseline, graph-constrained framework, and framework with distilled principles—across 10 regulatory tasks (n=30 runs per condition, N=90 total). Performance was assessed using automated code-density metrics and a blinded LLM-as-a-judge evaluation protocol with 2 independent panels.
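The scheduling idea described above can be illustrated with a minimal sketch. It is not the author's Adaptive Priority Scheduler: the node names, skill names, token costs, undirected hop-count distance, and greedy nearest-first selection are all assumptions made for illustration. It only shows the general shape of loading skills by graph proximity under a fixed token budget.

```python
from collections import deque

# Hypothetical 4-node slice of a pipeline DAG (the study's graph has 45 nodes).
EDGES = [("raw", "sdtm_dm"), ("sdtm_dm", "adam_adsl"), ("adam_adsl", "tables")]

# Hypothetical skills as (name, attached node, token cost).
SKILLS = [
    ("raw-intake", "raw", 300),
    ("sdtm-dm-mapping", "sdtm_dm", 400),
    ("adam-adsl-derivation", "adam_adsl", 500),
    ("tlf-layout", "tables", 600),
]


def hop_distances(edges, start):
    """Breadth-first hop counts from `start`, treating edges as undirected."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def select_skills(edges, skills, current, budget):
    """Greedily pick the skills nearest to `current` that fit in `budget` tokens."""
    dist = hop_distances(edges, current)
    reachable = [s for s in skills if s[1] in dist]
    # Nearest first; ties broken by lower cost, then by name for determinism.
    ranked = sorted(reachable, key=lambda s: (dist[s[1]], s[2], s[0]))
    chosen, used = [], 0
    for name, _node, cost in ranked:
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen, used


print(select_skills(EDGES, SKILLS, "adam_adsl", 1000))
# (['adam-adsl-derivation', 'sdtm-dm-mapping'], 900)
```

With the agent positioned at the hypothetical `adam_adsl` node and a 1000-token budget, the sketch loads the skill attached to the current node and its cheapest neighbor, and skips skills that would exceed the budget. A real scheduler might weight edges by direction or layer instead of using plain hop counts.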

Results:

Graph-constrained skill loading significantly improved overall output quality (+0.47 on a 5-point scale; 95% confidence interval [CI] 0.12–0.80; P=.004), regulatory structure compliance (+0.63; P<.001), and terminology precision (+0.50; P=.01). The LLM-based compliance evaluation corroborated these findings, with a compliance margin of +0.64 (framework: 4.34 vs baseline: 3.70) and substantial interpanel agreement (Cohen κ=0.76; Pearson r=0.95). The constraint mechanism incurred a modest 12% increase in token cost. An exploratory condition adding distilled global principles showed slight performance degradation (−0.13), suggesting attention saturation from redundant constraints.
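For readers unfamiliar with the interpanel agreement statistic reported above, the following sketch computes an unweighted Cohen κ from two lists of categorical ratings. The ratings are invented for illustration and are not study data, and the abstract does not state whether the study used a weighted or unweighted variant, so the unweighted form here is an assumption.

```python
from collections import Counter

# Hypothetical scores from two judging panels on the same six outputs.
PANEL_A = [4, 4, 3, 5, 4, 3]
PANEL_B = [4, 4, 3, 5, 3, 3]


def cohen_kappa(a, b):
    """Unweighted Cohen kappa for two equal-length lists of category labels."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


print(round(cohen_kappa(PANEL_A, PANEL_B), 3))
# 0.739
```

In this toy example the panels agree on 5 of 6 items (observed agreement 5/6) against a chance agreement of 13/36, giving κ = 17/23, or about 0.74.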

Conclusions:

Graph-constrained skill loading produces statistically significant improvements in regulatory compliance quality for clinical trial data generation, with favorable cost-efficiency ratios compared with general-purpose agentic approaches. The consistent improvements across multiple assessment tiers and the identification of boundary limitations (eg, metadata synthesis tasks) provide a foundation for future validation with real-world clinical trial data and practicing domain experts. All code and evaluation materials are publicly available.


Citation

Please cite as:

Yan J

Graph-Constrained Skill Loading for Domain-Adapted Agentic AI in Clinical Trial Data Pipelines: A Controlled Evaluation

JMIR Preprints. 08/05/2026:100769

DOI: 10.2196/preprints.100769

URL: https://preprints.jmir.org/preprint/100769

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.