Currently submitted to: JMIR Formative Research
Date Submitted: Apr 1, 2026
Open Peer Review Period: Apr 15, 2026 - Jun 10, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Synthetic Content Validation of Pediatric Trust Instruments Using Persona-Driven Large Language Models
ABSTRACT
Background:
Large language models (LLMs) could streamline healthcare instrument validation by serving as scalable, systematic expert panels and qualitative researcher surrogates. This is particularly relevant because traditional instrument development is time- and resource-intensive. There is currently a significant gap in validated trust instruments for pediatric emergency and surgical contexts. Such tools are essential because trust is foundational to the relationship between patient families and physicians and is associated with improved care-seeking and treatment adherence.
Objective:
This study had two objectives: (1) to develop and validate new trust instruments through a synthetic instrument validation (SIV) approach that integrates human and LLM capabilities, and (2) to evaluate appropriate use cases for LLMs in psychometric assessment.
Methods:
Two new trust instruments were developed, one for patient families and one for physicians. In phase one, the instruments underwent a two-stage content validation process using parallel synthetic and human expert panels (16 synthetic personas and 10 human experts across validation stages). Synthetic panels consisted of three persona-prompted LLMs (Claude Sonnet 4, GPT-5, Grok 4), with human panels serving as comparators. The Scale-Content Validity Index (S-CVI) and Fleiss' kappa (κ) acceptance thresholds were set at ≥0.80. In phase two, LLM performance in quantitative research tasks was evaluated. The patient family instrument underwent Flesch-Kincaid assessment, and both instruments underwent cosine similarity analyses using three parallel methods: algorithmic, LLM-instructed, and LLM-derived.
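For readers unfamiliar with the two agreement statistics named above, the following is a minimal illustrative sketch (not the authors' analysis code) of how Fleiss' κ and the S-CVI/Ave are conventionally computed from panel ratings. The rating matrices and the 4-point relevance scale are assumptions for illustration only.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories table of rating counts.
    counts[i][j] = number of raters assigning category j to item i;
    every row must sum to the same number of raters n."""
    N = len(counts)
    n = sum(counts[0])
    # per-item observed agreement P_i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P) / N
    # chance agreement from marginal category proportions
    k = len(counts[0])
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)


def s_cvi_ave(relevance):
    """S-CVI/Ave: mean of item-level CVIs, where each I-CVI is the share
    of experts rating the item 3 or 4 on a 4-point relevance scale."""
    i_cvis = [sum(1 for r in row if r >= 3) / len(row) for row in relevance]
    return sum(i_cvis) / len(i_cvis)
```

For example, four items rated by five experts into two categories with complete per-item consensus yields κ = 1.0, and an item rated relevant (≥3) by all experts contributes an I-CVI of 1.0 to the S-CVI/Ave.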
Results:
In phase one, human–synthetic expert panels demonstrated substantial inter-rater reliability across both instruments. Fleiss' κ values for dimensional validation were 0.84 (95% CI [0.72, 0.96]) for the patient family instrument and 0.87 (95% CI [0.72, 1.00]) for the physician instrument. For contextual validation, κ values were 0.83 (95% CI [0.73, 0.93]) and 0.88 (95% CI [0.80, 0.96]), respectively. All instrument sections exceeded the S-CVI ≥0.80 threshold across both stages. Phase two Flesch-Kincaid metrics converged across all three methods (grade level 8.1 ± 1.1; readability score 60.1 ± 5.6), meeting accessibility standards and demonstrating methodological similarity. In contrast, cosine similarity analyses revealed significant quantitative limitations in the LLMs, necessitating reliance on the algorithmic method alone; that method yielded a maximum cosine similarity of 0.83, indicating acceptable item distinctiveness overall.
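To make the "algorithmic" redundancy check concrete, here is a minimal sketch of pairwise cosine similarity between instrument items. The abstract does not specify the vectorization used, so this sketch assumes simple term-count vectors for illustration; the sample items are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def max_pairwise_similarity(items):
    """Highest cosine similarity over all item pairs. Values near 1 flag
    near-duplicate items; a lower maximum suggests distinct item content."""
    vecs = [Counter(item.lower().split()) for item in items]
    return max(cosine(vecs[i], vecs[j])
               for i in range(len(vecs))
               for j in range(i + 1, len(vecs)))
```

Under this scheme, two items sharing most of their wording score close to 1, while unrelated items score near 0, so a maximum of 0.83 across an instrument would fall below a common near-duplicate flag.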
Conclusions:
Persona-prompted LLMs effectively performed subjective psychometric assessments and reduced timelines from months to weeks, but showed limitations in quantitative computations. This suggests that LLMs currently excel in qualitative assessments, while falling short in rule-based and deterministic computations. These findings help establish task-dependent boundaries for LLM integration in psychometric research, necessitating selective human-LLM collaboration. This hybrid SIV framework shows potential to accelerate healthcare instrument development while maintaining validation rigor.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.