
Accepted for/Published in: JMIR Formative Research

Date Submitted: Nov 11, 2024
Date Accepted: Jul 3, 2025

The final, peer-reviewed published version of this preprint can be found here:

Sercombe J, Bryant Z, Wilson J

Evaluating a Customized Version of ChatGPT for Systematic Review Data Extraction in Health Research: Development and Usability Study

JMIR Form Res 2025;9:e68666

DOI: 10.2196/68666

PMID: 40789147

PMCID: PMC12338963

Evaluating a customised version of ChatGPT for systematic review data extraction in health research: a pilot study

  • Jayden Sercombe; 
  • Zachary Bryant; 
  • Jack Wilson

ABSTRACT

Background:

Systematic reviews are essential for synthesising research in health sciences, yet they are resource-intensive and prone to human error. Recent advances in artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, may streamline this process.

Objective:

This study aims to develop and evaluate a custom Generative Pre-Training Transformer (GPT), named Systematic Review Extractor Pro, for automating the data extraction phase of systematic reviews in health research.

Methods:

OpenAI's GPT Builder was used to create a GPT tailored to extract information from academic manuscripts. A sample of 20 studies across two distinct systematic reviews was used to evaluate the GPT's performance in extraction. Agreement rates between the GPT outputs and human reviewers were calculated for each study subsection.
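As a rough illustration of the agreement metric described above, the sketch below computes simple percent agreement between a GPT-extracted record and a human reviewer's record. The field values and the exact-match comparison are assumptions for illustration, not the authors' actual scoring protocol.

```python
def percent_agreement(gpt_items, human_items):
    """Share of extracted fields (in %) where the GPT output matches the human reviewer.

    Assumes paired lists of field values and a simple exact-match criterion;
    the study's real comparison procedure may differ.
    """
    if len(gpt_items) != len(human_items) or not gpt_items:
        raise ValueError("expected two non-empty lists of equal length")
    matches = sum(1 for g, h in zip(gpt_items, human_items) if g == h)
    return 100 * matches / len(gpt_items)


# Hypothetical extraction of four study-characteristic fields
gpt = ["RCT", "n=120", "Australia", "12 weeks"]
human = ["RCT", "n=120", "Australia", "8 weeks"]
print(percent_agreement(gpt, human))  # 75.0
```

In practice an agreement rate like this would be computed per study subsection (study characteristics, participant characteristics, and so on) and then averaged across the included studies.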

Results:

The GPT demonstrated high overall agreement rates with human reviewers, achieving 91.45% for review 1 and 89.31% for review 2. It was particularly accurate in extracting study characteristics (review 1: 95.25%; review 2: 90.83%) and participant characteristics (review 1: 95.03%; review 2: 90.00%), with lower performance observed in more complex areas such as methodological characteristics (87.07%) and statistical results (77.50%).

Conclusions:

As AI is rapidly evolving, the technology may significantly enhance systematic review practices by improving efficiency and reducing human error. The tool developed in the current study has been made open access.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.