JMIR Preprints #74426: GPT-4o's Effectiveness in ECG Image Interpretation for Cardiac Diagnostics: Evaluation Study

Current Preprint Settings

(as selected by the authors)

1. When the manuscript is submitted, allow peer review from:

(a) Anybody (open community peer review)
(b) Editor-selected reviewers (closed peer review)

2. When the manuscript is submitted, display the preprint PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

3. When the manuscript is accepted, display the accepted manuscript PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

GPT-4o's Effectiveness in ECG Image Interpretation for Cardiac Diagnostics: Evaluation Study

Haya Engelstein;
Roni Ramon Gonen;
Avi Sabbag;
Eyal Klang;
Karin Sudri;
Michal Cohen-Shelly;
Israel Barbash

ABSTRACT

Background:

Recent progress has demonstrated the potential of deep learning models in analyzing ECG pathologies. However, this method is intricate, expensive to develop, and designed for specific purposes. Large language models show promise in medical image interpretation, yet their effectiveness in ECG analysis remains understudied. GPT-4o, a multimodal AI model, capable of processing images and text without task-specific training, may offer an accessible alternative.

Objective:

This study evaluates GPT-4o's effectiveness in interpreting 12-lead ECGs, assessing classification accuracy, and exploring methods to enhance its performance.

Methods:

Six common ECG diagnoses were evaluated: Normal ECG, STEMI, AF, RBBB, LBBB, and paced rhythm, with 30 Normal ECGs and 10 of each abnormal pattern, totaling 80 cases (n=80). De-identified ECGs were analyzed using OpenAI’s GPT-4o. Our study employed both zero-shot and few-shot learning methodologies to investigate three main scenarios: (1) ECG image recognition, (2) binary classification of normal versus abnormal ECGs, and (3) multiclass classification into six categories.

Results:

The model excelled in recognizing ECG images, achieving an accuracy of 100%. In the classification of normal/abnormal ECG cases, the Few-Shot learning approach improved GPT-4o’s accuracy by 27%, reaching 80%. However, multiclass classification for a specific pathology remained limited, achieving only 41% accuracy.

Conclusions:

GPT-4o effectively differentiates normal from abnormal ECGs, suggesting its potential as an accessible AI-assisted triage tool. Although limited in diagnosing specific cardiac conditions, GPT-4o’s capability to interpret ECG images without specialized training highlights its potential for preliminary ECG interpretation in clinical and remote settings.

Citation

Please cite as:

Engelstein H, Ramon Gonen R, Sabbag A, Klang E, Sudri K, Cohen-Shelly M, Barbash I

Effectiveness of the GPT-4o Model in Interpreting Electrocardiogram Images for Cardiac Diagnostics: Diagnostic Accuracy Study

JMIR AI 2025;4:e74426

DOI: 10.2196/74426

PMID: 40845836

PMCID: 12375907

Download PDF

Request queued. Please wait while the file is being generated. It may take some time.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Mar 24, 2025

Date Accepted: Jul 9, 2025

GPT-4o's Effectiveness in ECG Image Interpretation for Cardiac Diagnostics: Evaluation Study

ABSTRACT

Citation