Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 7, 2025
Date Accepted: Oct 7, 2025

The final, peer-reviewed published version of this preprint can be found here:

Large Language Model Versus Manual Review for Clinical Data Curation in Breast Cancer: Retrospective Comparative Study

Kang YJ, Lee H, Yi JP, Kim H, Yoon CI, Baek JM, Kim Ys, Jeon YW, Rhu J, Lim SH, Choi H, Oh SJ

Large Language Model Versus Manual Review for Clinical Data Curation in Breast Cancer: Retrospective Comparative Study

JMIR Med Inform 2025;13:e73605

DOI: 10.2196/73605

PMID: 41197113

PMCID: 12599480

Large Language Model vs Manual Review for Clinical Data Curation in Breast Cancer: A Retrospective Comparative Study

  • Young Joon Kang
  • Hocheol Lee
  • Jae Pak Yi
  • Hyobin Kim
  • Chang Ik Yoon
  • Jong Min Baek
  • Yong-seok Kim
  • Ye Won Jeon
  • Jiyoung Rhu
  • Su Hyun Lim
  • Hoon Choi
  • Se Jeong Oh

ABSTRACT

Background:

Manual review of electronic health records for clinical research is labor- and time-intensive and prone to reviewer-dependent variation. Large language models (LLMs) offer the potential to automate the extraction of clinical data; however, few studies have examined their feasibility in surgical oncology.

Objective:

To evaluate the feasibility and accuracy of LLM-based processing compared with manual physician review for the extraction of clinical data from breast cancer records.

Methods:

In this retrospective comparative study, we analyzed breast cancer records from January to December 2019 at five academic hospitals. Two independent cohorts were compared: a manual physician review group (n=1,366) and an LLM-based processing group using Claude 3.5 Sonnet (n=1,734). Primary outcomes were the missing value rates of key clinical variables, the accuracy of data extraction, and concordance between the two cohorts. Secondary outcomes were comparison with national registry data, processing time, and resource utilization.
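The abstract does not specify how the primary outcomes were computed. As a minimal sketch only, a missing value rate and a field-level extraction accuracy could be calculated as follows. The record layout (one dictionary per patient), the field names, and the rule that `None` or an empty string counts as missing are all illustrative assumptions, not the authors' definitions.

```python
from typing import Mapping, Optional, Sequence

Record = Mapping[str, Optional[str]]


def missing_rate(records: Sequence[Record], field: str) -> float:
    """Fraction of records in which `field` is absent, None, or empty.

    Assumption: "missing" means absent/None/empty string; the study may
    have used a different definition.
    """
    if not records:
        raise ValueError("records must not be empty")
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def extraction_accuracy(
    extracted: Sequence[Record],
    reference: Sequence[Record],
    fields: Sequence[str],
) -> float:
    """Share of (record, field) pairs where the extracted value equals
    the reference value. Records are paired by position."""
    if len(extracted) != len(reference):
        raise ValueError("extracted and reference must have equal length")
    total = len(extracted) * len(fields)
    if total == 0:
        raise ValueError("nothing to compare")
    correct = sum(
        1
        for ext, ref in zip(extracted, reference)
        for f in fields
        if ext.get(f) == ref.get(f)
    )
    return correct / total


# Toy data, invented purely for illustration (not study data).
toy_records = [{"stage": "II"}, {"stage": None}, {"stage": ""}, {}]
toy_extracted = [
    {"stage": "II", "surgery": "BCS"},
    {"stage": "I", "surgery": "mastectomy"},
]
toy_reference = [
    {"stage": "II", "surgery": "BCS"},
    {"stage": "III", "surgery": "mastectomy"},
]
```

Comparing each extracted field with a physician-verified reference gives a single accuracy figure, but a per-variable breakdown would show where extraction errors concentrate.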

Results:

The two groups yielded comparable results for most clinical parameters. The LLM group had more complete documentation of lymph node assessment (91.2% vs. 78.5%) but a larger proportion of missing data for cancer staging (12.2% vs. 3.1%). The rate of breast-conserving surgery was similar in both groups (63.5% vs. 63.9%). The LLM achieved 90.8% accuracy in the validation analysis and required significantly less processing time (12 days vs. 7 months) and fewer physicians (two vs. five). The stage distribution of the LLM group aligned more closely with the national registry data than did that of the manual-review group (Cramér’s V = 0.03 vs. 0.07), and the LLM group captured more survival events (41 vs. 11; P = .002).
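Cramér’s V measures the strength of association in a contingency table on a scale from 0 to 1, so a smaller value here means a cohort's stage distribution differs less from that of the registry. The sketch below uses the standard uncorrected formula, V = sqrt(χ² / (n · min(r − 1, c − 1))). The abstract does not state whether the authors applied a bias correction or how the comparison table was built, so this is an assumed formulation, and the counts are invented for illustration.

```python
import math
from typing import Sequence


def cramers_v(table: Sequence[Sequence[float]]) -> float:
    """Uncorrected Cramér's V for an r x c table of observed counts."""
    rows, cols = len(table), len(table[0])
    if rows < 2 or cols < 2:
        raise ValueError("table must be at least 2 x 2")
    if any(len(row) != cols for row in table):
        raise ValueError("all rows must have the same length")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column needs a nonzero total")

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))


# Hypothetical counts: rows = data source (cohort, registry),
# columns = two stage categories. Not study data.
example_table = [[20, 30], [30, 20]]
example_v = cramers_v(example_table)
```

With identical row distributions the function returns 0, and with perfectly separated rows it returns 1.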

Conclusions:

LLM-based processing demonstrated comparable effectiveness to manual review by physicians, while significantly reducing processing time and resource utilization. Despite its limitations in integrated assessments, this approach showed potential for efficient clinical data extraction in oncology research.

Clinical Trial:

NA


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.