Accepted for/Published in: JMIR AI

Date Submitted: Oct 26, 2024
Date Accepted: Mar 21, 2025

The final, peer-reviewed published version of this preprint can be found here:

Training Language Models for Estimating Priority Levels in Ultrasound Examination Waitlists: Algorithm Development and Validation

Masayoshi K, Hashimoto M, Toda N, Mori H, Kobayashi G, Haque H, Mizuki S, Jinzaki M

Training Language Models for Estimating Priority Levels in Ultrasound Examination Waitlists: Algorithm Development and Validation

JMIR AI 2025;4:e68020

DOI: 10.2196/68020

PMID: 40694843

PMCID: 12325119

Training Language Models for Estimating Priority Levels in Ultrasound Examination Waitlists

  • Kanato Masayoshi
  • Masahiro Hashimoto
  • Naoki Toda
  • Hirozumi Mori
  • Goh Kobayashi
  • Hasnine Haque
  • Sou Mizuki
  • Masahiro Jinzaki

ABSTRACT

Background:

Ultrasound examinations, while valuable, are time-consuming and often limited in availability. Consequently, many hospitals implement reservation systems; however, these systems typically do not prioritize requests according to the purpose of the examination. Our hospital therefore uses a waitlist system that, when slots become available due to cancellations, prioritizes examination requests based on their clinical value. This system, however, requires a manual review of examination purposes, which are recorded as free-form text. We hypothesized that AI language models could provide a preliminary estimate of each request's priority before the manual review.

Objective:

This study aimed (1) to investigate potential challenges associated with using language models for estimating the priority of medical examination requests and (2) to evaluate the performance of language models in processing Japanese medical texts.

Methods:

We retrospectively collected ultrasound examination requests from the waitlist system at Keio University Hospital, spanning January 2020 to March 2023. Each request comprised an examination purpose documented by the requesting physician and a six-tier priority level assigned by a radiologist during the clinical workflow. We fine-tuned JMedRoBERTa, Luke, OpenCalm, and LLaMA2 under two conditions: (1) tuning only the final layer and (2) tuning all layers using either standard backpropagation or low-rank adaptation (LoRA).
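The difference between full fine-tuning and low-rank adaptation can be illustrated with a minimal sketch. It is not the study's training code: the layer size (768 × 768), the rank r, and the scaling factor alpha are hypothetical values chosen for illustration, and the helper names are invented here. LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) · B · A.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def trainable_params(d_out, d_in, r):
    """Trainable parameters for one linear layer under each strategy."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}


# Tiny demonstration with hypothetical sizes (not the study's settings).
random.seed(0)
d_out, d_in, r, alpha = 4, 3, 2, 4.0
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
b = [[0.0] * r for _ in range(d_out)]                                  # trainable, starts at zero

# Because B starts at zero, the adapted layer initially equals the pretrained one.
w_eff = lora_effective_weight(w, a, b, alpha)

counts = trainable_params(768, 768, 8)  # hypothetical layer size and rank
```

With these illustrative numbers, a single 768 × 768 layer has 589,824 weights under full fine-tuning but only 12,288 trainable weights under rank-8 LoRA, which is why LoRA is commonly used for larger models.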

Results:

After cleaning, the training and test datasets contained 2335 and 204 requests, respectively. When only the final layer was tuned, JMedRoBERTa outperformed the other models (Kendall coefficient = 0.225). With full fine-tuning, JMedRoBERTa continued to perform best (Kendall coefficient = 0.254), though by a smaller margin over the other models. The radiologists’ re-evaluation yielded a Kendall coefficient of 0.172.
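As a rough illustration of the metric reported here, the sketch below computes a Kendall rank correlation between two lists of ordinal priority levels. The abstract does not state which Kendall variant was used; this sketch assumes tau-b, which corrects for ties (expected when many requests share one of six levels). The function name and the example labels are hypothetical and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences of ordinal labels."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: contributes to neither term
        elif dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical priority levels: reference labels vs. model estimates.
reference = [1, 2, 2, 3]
estimated = [1, 3, 2, 3]
tau = kendall_tau_b(reference, estimated)
```

A value of 1 indicates identical ordering, 0 indicates no association, and -1 indicates fully reversed ordering; the coefficients reported above (0.17 to 0.25) therefore reflect weak positive agreement for both the models and the radiologists' re-evaluation.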

Conclusions:

Language models can estimate the priority of examination requests with accuracy comparable to that of human radiologists. The fine-tuning results indicate that general-purpose language models can be adapted to domain-specific texts (i.e., Japanese medical texts) with sufficient fine-tuning. Further research is required to address priority rank ambiguity, expand the dataset across multiple institutions, and explore more recent language models with potentially higher performance or better suitability for this task.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.