
Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 12, 2023
Date Accepted: Oct 24, 2023

The final, peer-reviewed published version of this preprint can be found here:

A Large Language Model Screening Tool to Target Patients for Best Practice Alerts: Development and Validation

Savage T, Shieh L


JMIR Med Inform 2023;11:e49886

DOI: 10.2196/49886

PMID: 38010803

PMCID: 10714262

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Large Language Models: A Cure for Physician Alert Fatigue

  • Thomas Savage; 
  • Lisa Shieh

ABSTRACT

Background:

Best Practice Alerts (BPAs) are messages to physicians in the Electronic Health Record (EHR) used to encourage appropriate utilization of health care resources. While these alerts help improve care and reduce costs, they also place a significant burden on physicians. The development of Large Language Models provides an opportunity to selectively identify target patients and thereby reduce physician alert fatigue.

Objective:

We propose a new application for Large Language Models (LLMs) in Quality Improvement: selectively identifying patients for Best Practice Alerts. An LLM-based system can increase alert efficiency and reduce physician alert fatigue. In this paper, we present an example case in which an LLM is used to optimize patient selection for a BPA encouraging prescription of deep venous thrombosis (DVT) prophylaxis.

Methods:

Using a simulated population from the MIMIC-III dataset, we present an LLM system that selectively targets patients appropriate for a DVT prophylaxis BPA by identifying and excluding patients experiencing acute bleeding. The system uses GPT-3 to create a synthetic training set, which is then used to fine-tune a BioMed-RoBERTa classification model. Model classifications were compared with those of a physician reviewer.
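The two-stage screening described above (a rule-based BPA trigger followed by a classifier that excludes acutely bleeding patients) can be sketched as follows. This is a toy illustration only: the keyword check stands in for the fine-tuned BioMed-RoBERTa classifier, and all names (`acute_bleed_classifier`, the patient fields) are hypothetical, not the authors' implementation.

```python
# Toy sketch of a two-stage BPA targeting flow. The keyword lookup is a
# stand-in for a fine-tuned text classifier; it is NOT the study's model.

BLEEDING_TERMS = {"hemorrhage", "gi bleed", "hematemesis", "melena"}


def rule_based_dvt_candidate(patient: dict) -> bool:
    # Simplified stand-in for an EHR rule: an inpatient with no existing
    # prophylaxis order would trigger the DVT prophylaxis BPA.
    return patient["inpatient"] and not patient["on_prophylaxis"]


def acute_bleed_classifier(note_text: str) -> bool:
    # Stand-in for the fine-tuned classifier: flags notes suggesting
    # active bleeding, a contraindication to DVT prophylaxis.
    text = note_text.lower()
    return any(term in text for term in BLEEDING_TERMS)


def should_fire_bpa(patient: dict) -> bool:
    # Fire the alert only when the rule matches AND the classifier does
    # not detect acute bleeding, reducing inappropriate alerts.
    if not rule_based_dvt_candidate(patient):
        return False
    return not acute_bleed_classifier(patient["note"])


stable = {"inpatient": True, "on_prophylaxis": False,
          "note": "Admitted for pneumonia, hemodynamically stable."}
bleeding = {"inpatient": True, "on_prophylaxis": False,
            "note": "Active GI bleed with melena overnight."}
```

In this sketch, `should_fire_bpa(stable)` fires the alert while `should_fire_bpa(bleeding)` suppresses it, which is the efficiency gain the paper targets.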

Results:

The model performed well, favoring either high sensitivity or high specificity depending on the composition of the training dataset. When negation examples were not included in the synthetic training set, the model achieved an ROC AUC of 0.89 with a sensitivity of 88% and a specificity of 77%. When negation examples were included, it achieved an ROC AUC of 0.85 with a sensitivity of 56% and a specificity of 90%. The two models increased BPA efficiency by 34% and 22%, respectively, compared with a simple rule-based algorithm.

Conclusions:

These results demonstrate how language models can significantly improve the efficiency of Best Practice Alerts over current rule-based algorithms, minimizing physician alert fatigue. We provide an example application that uses a simple RoBERTa model deployed with minimal compute power. Larger models (eg, GPT-3 or PaLM) could achieve even better performance. We anticipate that the development of HIPAA-compliant platforms for larger language models will further expand the potential of LLMs to revolutionize the field of Quality Improvement.





© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.