Currently submitted to: JMIR Medical Informatics

Date Submitted: Nov 7, 2025
Open Peer Review Period: Nov 25, 2025 - Jan 20, 2026

NOTE: This is an unreviewed Preprint

Warning: This is an unreviewed preprint (What is a preprint?). Readers are cautioned that this document has not been peer-reviewed by expert or patient reviewers or an academic editor, may contain misleading claims, is likely to undergo changes before final publication if accepted, and may have been rejected or withdrawn (in which case a note "no longer under consideration" will appear above).

Peer review me: Readers with relevant interest and expertise are encouraged to sign up as peer reviewers if the paper is within an open peer-review period (in that case, a "Peer Review Me" button to sign up as a reviewer is displayed above). All preprints currently open for review are listed here. Outside the formal open peer-review period, we encourage you to tweet about the preprint.

Citation: Please cite this preprint only for review purposes or for grant applications and CVs (if you are the author).

Final version: If our system detects a final peer-reviewed "version of record" (VoR) published in any journal, a link to that VoR will appear below. Readers are then encouraged to cite the VoR instead of this preprint.

Settings: If you are the author, you can log in and change the preprint display settings, but the preprint URL/DOI is intended to be stable and citable, so it should not be removed once posted.

Submit: To post your own preprint, simply submit your manuscript to any JMIR journal and choose the appropriate settings to expose the submitted version as a preprint.

Warning: This is an author submission that has not been peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

ECG-R1: A Multi-modal Vision-Language Model with Reinforcement Learning for Differentiating Ischemic from Non-ischemic T-wave Inversion

  • Yunzhang Cheng; 
  • Zhongkai Wang; 
  • Wen Zhang; 
  • Qin Zhang; 
  • Mingwei Zhang; 
  • Songbin Cai; 
  • Tianyi Zhang

ABSTRACT

Background:

The differentiation of ischemic from non-ischemic T-wave inversion (TWI) on electrocardiograms (ECGs) is a critical diagnostic challenge in cardiology. The non-specific nature of TWI leads to high false-positive rates, resulting in unnecessary, costly, and risky invasive procedures for patients. Existing deep learning models are often limited by being single-modality "black boxes".

Objective:

The objective of this study is to develop a novel diagnostic framework designed to address the critical clinical challenge of accurately differentiating ischemic from non-ischemic TWI. By utilizing a multi-modal Vision-Language Model trained with a Reinforcement Learning (RL) paradigm, this study aims to improve diagnostic accuracy and provide interpretable reasoning.

Methods:

We develop ECG-R1, a multi-modal framework built on the Qwen2-VL-2B Vision-Language Model that analyzes both ECG waveform images and associated clinical text. Instead of supervised fine-tuning (SFT), the model is trained with an RL paradigm using the Group Relative Policy Optimization (GRPO) algorithm. The model is trained to generate a structured output containing an explicit reasoning trace and a final "Yes" or "No" answer. A two-component, rule-based reward function assesses both format adherence and diagnostic accuracy. Performance is compared against strong SFT baselines.
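The two-component reward described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `<think>`/`<answer>` tag names, the equal 0.5/0.5 weighting of the two components, and the function name are assumptions for the sake of the example; the paper specifies only that the reward checks format adherence and diagnostic accuracy.

```python
import re

def grpo_reward(completion: str, gold_answer: str) -> float:
    """Rule-based reward with two components: format adherence and accuracy.

    Assumes (hypothetically) that the model is prompted to emit its reasoning
    trace inside <think>...</think> followed by a final <answer>Yes</answer>
    or <answer>No</answer>. The 0.5/0.5 split between the two components is
    illustrative only.
    """
    # Format component: the output must contain a reasoning trace
    # followed by a well-formed final answer tag.
    pattern = r"<think>.+?</think>\s*<answer>\s*(Yes|No)\s*</answer>"
    match = re.search(pattern, completion, flags=re.DOTALL)
    if match is None:
        return 0.0  # malformed output earns no reward at all

    reward = 0.5  # format adherence satisfied
    # Accuracy component: the extracted answer must match the label.
    if match.group(1).strip().lower() == gold_answer.strip().lower():
        reward += 0.5
    return reward
```

Under GRPO, a reward like this is computed for each completion in a sampled group, and the group-relative advantages drive the policy update; no learned reward model is needed.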

Results:

On a multi-modal dataset of 12,917 cases with TWI, our GRPO-trained model achieves an average accuracy of 74.07%, demonstrating strong generalization with 72.93% accuracy in cross-hospital validation. This is an improvement of roughly 24 percentage points over the ~50% diagnostic accuracy of clinicians and 8.2% higher than the best SFT baseline, while using ~71% fewer parameters.

Conclusions:

The RL-based ECG-R1 framework successfully differentiates ischemic from non-ischemic TWI and demonstrates significantly better generalization than standard SFT methods. By enhancing diagnostic accuracy and providing interpretable reasoning, this approach offers a more robust and trustworthy tool to support clinical decision-making in cardiology.


 Citation

Please cite as:

Cheng Y, Wang Z, Zhang W, Zhang Q, Zhang M, Cai S, Zhang T

ECG-R1: A Multi-modal Vision-Language Model with Reinforcement Learning for Differentiating Ischemic from Non-ischemic T-wave Inversion

JMIR Preprints. 07/11/2025:87227

DOI: 10.2196/preprints.87227

URL: https://preprints.jmir.org/preprint/87227


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.