JMIR Preprints #70450: LLM-Assisted Evidence Synthesis: Assessing Risk of Bias of Randomized Controlled Trials with RoB 2

LLM-Assisted Evidence Synthesis: Assessing Risk of Bias of Randomized Controlled Trials with RoB 2

Jiajie Huang;
Honghao Lai;
Weilong Zhao;
Danni Xia;
Chunyang Bai;
Jianing Liu;
Jiayi Liu;
Bei Pan;
Jinhui Tian;
Long Ge

ABSTRACT

Background:

The revised risk-of-bias tool (RoB 2) overcomes some limitations of the original version but also introduces challenges in its application. Large language models (LLMs) may assist in applying RoB 2. However, the exact methods and their reliability remain uncertain.

Objective:

This feasibility study aims to investigate the capability of LLMs to assess the risk of bias (ROB) of randomized controlled trials (RCTs) with RoB 2.

Methods:

We systematically searched for Cochrane reviews that used RoB 2. The reviews were classified by whether the effect of interest was adherence to intervention or assignment to intervention. From each category, 23 RCTs were randomly selected. The RoB 2 judgments reported in the Cochrane reviews were used as the external validation standard. Three experienced reviewers were recruited to assess the risk of bias of the 46 selected RCTs using RoB 2. The reviewer judgments for six RCTs were used to develop and optimize the prompt for the LLMs. The remaining 40 trials were used to establish the internal validation standard. Accuracy rates were calculated at both the domain and signaling question levels to reflect accuracy; the consistent assessment rate and Cohen's κ were calculated to gauge consistency; and assessment time was recorded to measure efficiency.
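As an illustration of the accuracy and agreement measures named above, the following minimal sketch computes an accuracy rate and Cohen's κ for two sets of categorical risk-of-bias judgments. It is not the authors' analysis code. The function names, the label strings, and the six example judgments are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def accuracy_rate(predicted, reference):
    """Proportion of judgments in `predicted` that match `reference`."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = accuracy_rate(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every category either rater used.
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical judgments for six trials (illustrative data only).
llm_judgments = ["low", "low", "high", "some concerns", "low", "high"]
reference_judgments = ["low", "some concerns", "high", "some concerns", "low", "low"]

print(f"accuracy rate: {accuracy_rate(llm_judgments, reference_judgments):.3f}")
print(f"Cohen's kappa: {cohen_kappa(llm_judgments, reference_judgments):.3f}")
```

In this toy example the two sets of judgments agree on 4 of 6 trials, while κ is lower than the raw agreement because part of that agreement would be expected by chance.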

Results:

Compared to Cochrane reviews, the LLMs' judgments demonstrated accuracy rates of 57.5% and 70% for Overall (assignment) and Overall (adhering), respectively. When compared to reviewer judgments, the LLMs' accuracy rates for Overall (assignment) and Overall (adhering) were 65% and 70.0%, respectively. The average accuracy rates for the remaining six domains were 65.2% (95% CI, 57.6%-72.7%) and 74.2% (95% CI, 64.7%-83.9%) when compared to Cochrane reviews and reviewers, respectively. The average accuracy rate at the signaling question level was 83.2% (95% CI: 77.5%-88.9%), and the consistent assessment rate was 85.2% (95% CI: 85.15%-88.79%). Compared to reviewers, the LLMs completed assessments 29.6 minutes (95% CI: 25.6-33.6) faster.

Conclusions:

LLMs were capable of rapidly assessing the risk of bias in RCTs using RoB 2 and exhibited a comparatively high level of accuracy. This suggests the potential utility of employing LLMs as adjunctive tools in the systematic review process.

Citation

Please cite as:

Huang J, Lai H, Zhao W, Xia D, Bai C, Liu J, Liu J, Pan B, Tian J, Ge L

Large Language Model–Assisted Risk-of-Bias Assessment in Randomized Controlled Trials Using the Revised Risk-of-Bias Tool: Evaluation Study

J Med Internet Res 2025;27:e70450

DOI: 10.2196/70450

PMID: 40554779

PMCID: 12238788

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 22, 2024

Date Accepted: May 5, 2025
