Currently accepted at: JMIR Medical Informatics

Date Submitted: Sep 11, 2025
Open Peer Review Period: Sep 25, 2025 - Nov 20, 2025
Date Accepted: Dec 22, 2025

This paper has been accepted and is currently in production.

It will appear shortly on 10.2196/83318

The version shown here is the final accepted version (not yet copyedited).

ARTIFICIAL INTELLIGENCE MODELS FOR PREDICTING TRIAGE IN EMERGENCY DEPARTMENTS: A 7-MONTH RETROSPECTIVE COMPARATIVE STUDY OF NLP, LLM, AND JEPA ARCHITECTURES

  • Edouard Lansiaux
  • Ramy Azzouz
  • Emmanuel Chazard
  • Amélie Vromant
  • Eric Wiel

ABSTRACT

Background: Triage errors, including undertriage and overtriage, remain persistent challenges in emergency departments (EDs). With increasing patient influx and staff shortages, the integration of artificial intelligence (AI) into triage protocols has gained growing attention.

Objective: This study aimed to compare the performance of three AI models for predicting triage outcomes against the FRENCH scale and clinical practice: Natural Language Processing (NLP), Large Language Models (LLM), and Joint Embedding Predictive Architecture (JEPA).

Methods: We conducted a retrospective analysis of a prospectively recruited cohort of adult patients triaged over a 7-month period at Roger Salengro Hospital ED (Lille, France). Three AI models were trained and validated: (1) TRIAGEMASTER (NLP), (2) URGENTIAPARSE (LLM), and (3) EMERGINET (JEPA). Data included demographic details, verbatim chief complaints, vital signs, and triage outcomes based on the FRENCH scale and GEMSA coding. The primary outcome was concordance of AI-predicted triage levels with the French gold standard, assessed with F1-Score, Weighted Kappa, Spearman correlation, MAE, RMSE, and AUC-ROC.

Results: The LLM model (URGENTIAPARSE) achieved the highest accuracy (composite score: 2.514) compared with JEPA (EMERGINET, 0.438), NLP (TRIAGEMASTER, –3.511), and nurse triage (–4.343). Performance indicators confirmed this superiority: F1-Score and AUC-ROC were 0.900 and 0.879 for URGENTIAPARSE, versus 0.731 and 0.686 for EMERGINET, 0.618 and 0.642 for TRIAGEMASTER, and 0.303 and 0.776 for nurse triage. Secondary analyses showed URGENTIAPARSE to be effective in predicting hospitalization needs (GEMSA) and robust across both structured data and raw transcripts.

Conclusions: Among the evaluated architectures, the LLM model demonstrated the most accurate triage predictions. Integrating LLM-based AI into ED workflows has the potential to improve patient safety and operational efficiency. Future work should focus on overcoming model limitations and ensuring transparent, ethical implementation.
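For readers unfamiliar with the concordance metrics named in the Methods, the following is a minimal sketch of how some of them (F1-Score, Weighted Kappa, MAE, RMSE) might be computed for ordinal triage levels. It is not the authors' implementation. The function names, the toy labels, the use of five levels, the support-weighted averaging for F1, and the quadratic weighting for kappa are all assumptions made for illustration; the abstract does not specify these choices, and the composite score is not reproduced here because its definition is not given.

```python
from collections import Counter
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted triage levels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large level gaps more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support.

    Assumption: the abstract does not say which F1 averaging was used.
    """
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, levels):
    """Cohen's kappa with quadratic disagreement weights over ordered levels.

    Assumption: quadratic weights; linear weights are an equally common choice.
    """
    k = len(levels)
    index = {lvl: i for i, lvl in enumerate(levels)}
    n = len(y_true)
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n
    # Kappa is undefined when no disagreement is expected by chance.
    return 1.0 - num / den if den else float("nan")


# Toy data only (hypothetical, not from the study): reference vs predicted levels.
reference = [1, 2, 3, 4, 5, 3]
predicted = [1, 2, 3, 4, 4, 3]
levels = [1, 2, 3, 4, 5]

print("MAE:", mae(reference, predicted))
print("RMSE:", rmse(reference, predicted))
print("Weighted F1:", weighted_f1(reference, predicted))
print("Quadratic weighted kappa:", quadratic_weighted_kappa(reference, predicted, levels))
```

Because triage levels are ordinal, MAE, RMSE, and weighted kappa reward a prediction that is one level off more than one that is several levels off, whereas F1 treats every misclassification alike. That difference is one reason a study might report several of these metrics side by side.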


 Citation

Please cite as:

Lansiaux E, Azzouz R, Chazard E, Vromant A, Wiel E

ARTIFICIAL INTELLIGENCE MODELS FOR PREDICTING TRIAGE IN EMERGENCY DEPARTMENTS: A 7-MONTH RETROSPECTIVE COMPARATIVE STUDY OF NLP, LLM, AND JEPA ARCHITECTURES

JMIR Medical Informatics. 22/12/2025:83318 (forthcoming/in press)

DOI: 10.2196/83318

URL: https://preprints.jmir.org/preprint/83318


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.