Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 8, 2025
Open Peer Review Period: Feb 17, 2025 - Apr 14, 2025
Date Accepted: Jun 17, 2025
Using GPT-4o for CAD-RADS feature extraction and categorization with free-text coronary CT Angiography reports
ABSTRACT
Background:
Although the Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a standardized reporting approach, radiologists continue to favor free-text reports. This preference creates significant challenges for data extraction and analysis in longitudinal studies, potentially limiting large-scale research and quality assessment initiatives.
Objective:
To evaluate the ability of the GPT-4o model to convert real-world coronary CT angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P Categories.
Methods:
This retrospective study analyzed CCTA reports between January 2024 and July 2024. A subset of 25 reports was used for prompt engineering to instruct the LLM to extract CAD-RADS categories, P Categories, and the presence of myocardial bridges and non-calcified plaques. Reports were processed using the GPT-4o API and custom Python scripts. The ground truth was established by radiologist review based on the CAD-RADS 2.0 guidelines. Model performance was assessed using accuracy, sensitivity, specificity, and F1 score. Intra-rater reliability was assessed using Cohen's Kappa coefficient.
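The abstract does not give the prompt or the output schema, so the following is only a minimal sketch of how a model reply could be validated into a structured record. It assumes the model is asked to answer with a JSON object; the field names (`cad_rads`, `p_category`, `myocardial_bridge`, `noncalcified_plaque`) and the `ReportRecord` and `parse_model_reply` names are hypothetical, and the API call itself is omitted. Only the category vocabularies come from CAD-RADS 2.0.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Category vocabularies defined by CAD-RADS 2.0.
CAD_RADS_CATEGORIES = {"0", "1", "2", "3", "4A", "4B", "5", "N"}
P_CATEGORIES = {"P1", "P2", "P3", "P4"}


@dataclass(frozen=True)
class ReportRecord:
    """Hypothetical structured form of one CCTA report."""
    cad_rads: str
    p_category: Optional[str]  # None when the reply gives no plaque burden
    myocardial_bridge: bool
    noncalcified_plaque: bool


def parse_model_reply(reply: str) -> ReportRecord:
    """Validate a JSON reply (assumed format) and return a ReportRecord.

    Raises ValueError on malformed JSON or on any value outside the
    expected vocabularies, so bad replies can be flagged for manual review.
    """
    data = json.loads(reply)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    cad_rads = str(data.get("cad_rads")).strip().upper()
    if cad_rads not in CAD_RADS_CATEGORIES:
        raise ValueError(f"unknown CAD-RADS category: {cad_rads!r}")

    p_category = data.get("p_category")
    if p_category is not None:
        p_category = str(p_category).strip().upper()
        if p_category not in P_CATEGORIES:
            raise ValueError(f"unknown P category: {p_category!r}")

    flags = {}
    for name in ("myocardial_bridge", "noncalcified_plaque"):
        value = data.get(name)
        if not isinstance(value, bool):
            raise ValueError(f"{name} must be true or false")
        flags[name] = value

    return ReportRecord(cad_rads, p_category, **flags)
```

Rejecting out-of-vocabulary values rather than coercing them is a deliberate choice in this sketch: a reply such as "CAD-RADS 6" is more useful as a flagged error than as a silently mapped category.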
Results:
Among 999 patients (median age 66 years, range 58-74; 650 males), CAD-RADS categorization showed accuracy of 0.98-1.00, sensitivity of 0.95-1.00, specificity of 0.98-1.00, and F1 score of 0.96-1.00. P Categories demonstrated accuracy of 0.97-1.00, sensitivity of 0.90-1.00, specificity of 0.98-1.00, and F1 score of 0.91-0.99. Myocardial bridge detection achieved 0.98 accuracy, and calcified plaque detection showed 0.98 accuracy. Cohen's Kappa values for all classifications exceeded 0.98.
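The ranges reported above are consistent with computing each metric per category in a one-vs-rest fashion, although the abstract does not state this explicitly. The sketch below shows how such per-category metrics and Cohen's Kappa can be computed from paired ground-truth and model labels. It is an illustration under that assumption, not the authors' scripts; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def one_vs_rest_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                        label: str) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and F1 for one category,
    treating that category as positive and all others as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    nan = float("nan")  # returned when a denominator is zero
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan,
    }


def cohens_kappa(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the product of the two marginal distributions.
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / n**2
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

On made-up toy labels (not study data), with ground truth `["1", "1", "2", "2", "3", "3"]` and predictions `["1", "1", "2", "3", "3", "3"]`, category "3" has sensitivity 1.0, specificity 0.75, and F1 0.8, and the overall Kappa is 0.75.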
Conclusions:
The GPT-4o model efficiently and accurately converts CCTA free-text reports into structured data, excelling in CAD-RADS classification, plaque burden assessment, and detection of myocardial bridges and calcified plaques.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.