JMIR Preprints #73605: Large Language Model vs Manual Review for Clinical Data Curation in Breast Cancer: A Retrospective Comparative Study

Large Language Model vs Manual Review for Clinical Data Curation in Breast Cancer: A Retrospective Comparative Study

Young Joon Kang;
Hocheol Lee;
Jae Pak Yi;
Hyobin Kim;
Chang Ik Yoon;
Jong Min Baek;
Yong-seok Kim;
Ye Won Jeon;
Jiyoung Rhu;
Su Hyun Lim;
Hoon Choi;
Se Jeong Oh

ABSTRACT

Background:

Manual review of electronic health records for clinical research is labor- and time-intensive and prone to reviewer-dependent variation. Large language models (LLMs) offer the potential to automate the extraction of clinical data; however, few studies have examined their feasibility in surgical oncology.

Objective:

To evaluate the feasibility and accuracy of LLM-based processing compared with manual physician review for the extraction of clinical data from breast cancer records.

Methods:

In this retrospective comparative study, we analyzed breast cancer records from five academic hospitals covering January to December 2019. Two independent cohorts were compared: a manual physician review group (n=1,366) and an LLM-based processing group using Claude 3.5 Sonnet (n=1,734). Primary outcomes included missing value rates of key clinical variables, accuracy of data extraction, and concordance between the two cohorts. Secondary outcomes included comparison with national registry data, processing time, and resource utilization.

Results:

The groups yielded comparable results for most clinical parameters. The LLM group yielded better documentation of lymph node assessment (91.2% vs. 78.5%) but had a larger proportion of missing data for cancer staging (12.2% vs. 3.1%). The groups had similar rates of breast-conserving surgery (63.5% vs. 63.9%). The LLM achieved 90.8% accuracy in the validation analysis and required significantly less processing time (12 days vs. 7 months) and fewer physicians (two vs. five). The LLM group’s stage distribution aligned better with the national registry data than did that of the manual-review group (Cramér’s V = 0.03 vs. 0.07), and the LLM group captured more survival events (41 vs. 11; P = .002).
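As a rough illustration of the Cramér’s V statistic reported above, the following minimal Python sketch computes it from a contingency table of counts. The function name `cramers_v` and every count in the example are hypothetical and are not taken from the study; the abstract does not describe the authors' actual computation, so this is only one standard way to obtain the statistic. Values near 0 indicate that two distributions are closely aligned.

```python
from math import sqrt


def cramers_v(table):
    """Cramér's V for an r x c contingency table (list of rows of counts).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalize by the sample size and the smaller table dimension minus one
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi2 / (n * k))


# Hypothetical stage counts (stages I-IV); NOT data from the study.
cohort = [40, 35, 20, 5]
registry = [400, 350, 200, 50]

# Identical proportions in both rows give a V of zero (up to rounding).
v_same = cramers_v([cohort, registry])

# A perfectly separated 2 x 2 table gives the maximum value of one.
v_max = cramers_v([[10, 0], [0, 10]])
```

Because V is scaled to lie between 0 and 1 regardless of sample size, it allows the two cohorts to be compared against the registry on a common footing even though the cohorts differ in size.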

Conclusions:

LLM-based processing demonstrated effectiveness comparable to that of manual review by physicians while significantly reducing processing time and resource utilization. Despite its limitations in integrated assessments, this approach showed potential for efficient clinical data extraction in oncology research.

Clinical Trial: NA

Citation

Please cite as:

Kang YJ, Lee H, Yi JP, Kim H, Yoon CI, Baek JM, Kim Ys, Jeon YW, Rhu J, Lim SH, Choi H, Oh SJ

Large Language Model Versus Manual Review for Clinical Data Curation in Breast Cancer: Retrospective Comparative Study

JMIR Med Inform 2025;13:e73605

DOI: 10.2196/73605

PMID: 41197113

PMCID: 12599480

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 7, 2025

Date Accepted: Oct 7, 2025
