Currently accepted at: JMIR Formative Research
Date Submitted: Dec 5, 2023
Open Peer Review Period: Dec 5, 2023 - Jan 31, 2024
Date Accepted: Jan 16, 2026
This paper has been accepted and is currently in production.
It will appear shortly at DOI 10.2196/55127
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
RoBuster: A Corpus Annotated with Risk of Bias Text Spans in Randomized Controlled Trials
ABSTRACT
Background:
Risk of bias (RoB) assessment of randomized controlled trials (RCTs) is vital to answering systematic review questions accurately. Manual RoB assessment for hundreds of RCTs is a cognitively demanding and lengthy process. Automation has the potential to assist reviewers in rapidly identifying text descriptions in RCTs that indicate potential risks of bias. However, no RoB text span-annotated corpus exists that could be used to fine-tune or evaluate large language models (LLMs), and there are no established guidelines for annotating RoB spans in RCTs.
Objective:
The revised Cochrane RoB Assessment 2 (RoB 2) tool provides comprehensive guidelines for RoB assessment; however, due to the inherent subjectivity of this tool, it cannot be directly used as RoB annotation guidelines. Our objective was to develop precise RoB text span annotation instructions that could address this subjectivity and thus aid the corpus annotation.
Methods:
We leveraged RoB 2 guidelines to develop visual instructional placards that serve as text annotation guidelines for RoB spans and risk judgments. Expert annotators employed these visual placards to annotate a dataset named RoBuster, consisting of 41 full-text RCTs from the domains of physiotherapy and rehabilitation. We report inter-annotator agreement (IAA) between two expert annotators for text span annotations before and after applying visual instructions on a subset (9 of 41 RCTs) of RoBuster. We also provide IAA on bias risk judgments using Cohen's kappa. Moreover, we utilized a portion of RoBuster (10 of 41 RCTs) to evaluate a large language model (GPT-3.5) on the challenging task of RoB span extraction, using a straightforward evaluation framework to demonstrate the utility of the corpus.
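The two agreement metrics named above can be sketched in a few lines. This is a minimal illustration, not code from the paper: it assumes spans are compared by exact (start, end, label) match, whereas the authors' matching criteria may differ.

```python
# Minimal sketch (not the paper's code) of the two IAA metrics:
# span-level F1 over exact-match spans, and Cohen's kappa on judgments.
from collections import Counter

def span_f1(spans_a, spans_b):
    """F1 between two sets of (start, end, label) tuples, exact match."""
    a, b = set(spans_a), set(spans_b)
    tp = len(a & b)  # spans both annotators marked identically
    if tp == 0:
        return 0.0
    precision = tp / len(a)
    recall = tp / len(b)
    return 2 * precision * recall / (precision + recall)

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement on paired categorical risk judgments."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Expected agreement if both annotators labeled at random
    # with their own marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)
```

Note that kappa can be negative when observed agreement falls below chance, which is consistent with the -0.235 lower bound reported in the Results.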
Results:
We present a corpus of 41 RCTs with fine-grained text span annotations comprising 28,427 tokens belonging to 22 RoB classes. The IAA at the text span level, calculated using the F1 measure, varies from 0% to 90%, while Cohen's kappa for risk judgments ranges between -0.235 and 1.0. Employing visual instructions for annotation increases the IAA by more than 17 percentage points. GPT-3.5 shows promising but varied agreement with the expert annotations across the different bias questions.
Conclusions:
Despite comprehensive bias assessment guidelines and visual instructional placards, RoB annotation remains a complex task. Using visual placards for bias assessment and annotation enhances IAA compared to annotation without them; however, text annotation remains challenging for subjective questions and for questions whose supporting information is unavailable in the RCTs. Similarly, while GPT-3.5 demonstrates effectiveness, its accuracy diminishes on more subjective RoB questions and when relevant information is sparse.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.