Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Sep 2, 2025
Date Accepted: Feb 24, 2026

The final, peer-reviewed published version of this preprint can be found here:

A Bilingual Arabic-English Ambient AI Scribe for Clinical Documentation: Prospective Evaluation Study

Khan UT, Khan AT, Aljaadi W, Alhadlaq R, Baqashmer Z, Alsafi Y, Alomran Y, Al Rusaiyes M, Radif M, Khan TN, Altamimi SAS

JMIR Med Inform 2026;14:e83335

DOI: 10.2196/83335

PMID: 41875245

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

End-to-End Development, Implementation, and Evaluation of a Bilingual Arabic-English AI Medical Scribe

  • Umair Tahir Khan; 
  • Ammar Tahir Khan; 
  • Waleed Aljaadi; 
  • Razan Alhadlaq; 
  • Zahran Baqashmer; 
  • Yasin Alsafi; 
  • Yousef Alomran; 
  • Maha Al Rusaiyes; 
  • Muaddiyah Radif; 
  • Tahir Naeem Khan; 
  • Saleh Abdullah Saleh Altamimi

ABSTRACT

Background:

Medical ambient AI scribes reduce documentation burden, but the evidence comes almost entirely from English-language systems. In the Arabic-speaking world, physicians converse mainly in Arabic but write clinical notes in English, which adds cognitive burden. Because Arabic-language corpora are scarce, development of ambient scribes for the bilingual Arabic-English setting has been limited. Here, we describe the development and deployment of a bilingual scribe in the clinical setting.

Objective:

To evaluate the feasibility and performance of Sahl AI, a bilingual Arabic-English medical AI scribe, using a full end-to-end methodology from raw audio to clinical note.

Methods:

A phase II, single-arm pilot study was conducted in two stages, (i) development and (ii) implementation, across outpatient, inpatient, and primary-care clinics within the Riyadh First Health Cluster. In the development stage, consultation audio recordings were collected and manually annotated to fine-tune the AI pipeline; technical feasibility was assessed across 64 encounters. In the implementation stage, the refined system generated notes for 55 real-world consultations. Notes were evaluated independently by two blinded physicians using a modified Physician Documentation Quality Instrument-9 (PDQI-9). Additionally, 22 participating clinicians completed structured surveys on usability, workflow integration, and perceived time savings. Main outcomes were documentation quality (PDQI-9 scores), comparative performance across Arabic and English notes, and physician-reported usability and time-saving potential.

Results:

During the development stage, the AI pipeline was fine-tuned to produce version 1 of Sahl AI, which was tested for technical feasibility in 64 encounters. It achieved an overall modified PDQI-9 score of 93.7% (42.2/45), although its accuracy was rated at 87% (4.35/5). After further fine-tuning, version 2 of Sahl AI was tested in the implementation stage on 55 consecutive consultations for real-world evaluation, with notes rated by two independent physicians. Sahl AI achieved an average PDQI-9 score of 94.3%. Arabic and English notes performed similarly overall (94.1% vs 94.5%), with accuracy rated at 90.5% for Arabic and 95.3% for English (P=.054). Internal consistency (98.7%) and comprehensibility (97.8%) were the top-rated domains. All 22 surveyed physicians agreed or strongly agreed that the notes were comprehensive, and 95% perceived potential time savings and reduced burnout.
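The percentages above are raw scores expressed against their maximum (for example, a total out of 45 or a single domain out of 5). The following is a minimal sketch of that arithmetic, assuming nine domains each rated on a 1-5 scale and assuming the two raters' scores are averaged per domain before summing; the aggregation order, the function names, and the example ratings are illustrative assumptions, not the study's method or data.

```python
from statistics import mean

PDQI_ITEMS = 9   # assumption: nine domains in the modified instrument
LIKERT_MAX = 5   # assumption: each domain rated 1-5, so the maximum total is 45


def to_percent(score: float, max_score: float) -> float:
    """Express a raw score as a percentage of its maximum, rounded to 1 decimal."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return round(100 * score / max_score, 1)


def note_total(ratings_by_rater: list[list[float]]) -> float:
    """Average each domain across raters, then sum the domain means."""
    domain_means = [mean(domain) for domain in zip(*ratings_by_rater)]
    return sum(domain_means)


# Hypothetical ratings from two raters for a single note (not study data).
rater_a = [5, 4, 5, 5, 4, 5, 5, 5, 4]
rater_b = [5, 5, 5, 4, 4, 5, 5, 5, 5]

total = note_total([rater_a, rater_b])
print(total, to_percent(total, PDQI_ITEMS * LIKERT_MAX))

# A single-domain score from the text, converted the same way.
print(to_percent(4.35, LIKERT_MAX))
```

Because published means are themselves rounded, recomputing a percentage from a rounded numerator can differ from the reported figure in the last decimal place.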

Conclusions:

Sahl AI, a bilingual Arabic-English medical ambient AI scribe, generates accurate and high-quality notes, reducing cognitive load for clinicians and offering a scalable documentation tool for bilingual care. This study provides the first empirical basis for rigorous end-to-end AI scribe evaluation in low-resource languages.

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.