Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Apr 17, 2024
Date Accepted: Sep 11, 2024

The final, peer-reviewed published version of this preprint can be found here:

Zhang C, Liu S, Zhou X, Zhou S, Tian Y, Wang S, Xu N, Li W

Examining the Role of Large Language Models in Orthopedics: Systematic Review

J Med Internet Res 2024;26:e59607

DOI: 10.2196/59607

PMID: 39546795

PMCID: 11607553

Role of Large Language Model in Orthopaedics: Systematic Review

  • Cheng Zhang
  • Shanshan Liu
  • Xingyu Zhou
  • Siyu Zhou
  • Yinglun Tian
  • Shenglin Wang
  • Nanfang Xu
  • Weishi Li

ABSTRACT

Background:

Large language models (LLMs) can understand natural language and generate corresponding text, images, and even videos based on prompts, which gives them great potential in medical scenarios. Orthopaedics is a major branch of medicine, and orthopaedic diseases impose a significant socioeconomic burden that the application of LLMs could help alleviate. Several pioneering studies have examined LLMs across various orthopaedic subspecialties to explore their performance on different tasks. However, few reviews of these studies exist, and a systematic summary of the existing research is absent.

Objective:

The objective of this review is to comprehensively summarize the research findings on the application of LLMs in orthopaedics and to explore the potential opportunities and challenges.

Methods:

PubMed, Embase, and Cochrane Library databases were searched from January 1, 2014, to February 22, 2024, with the language limited to English. The search terms were divided into two categories, large language model and orthopaedics, and included variants of "large language model", "generative artificial intelligence", "ChatGPT", and "orthopaedics". After searching, studies were selected according to the inclusion and exclusion criteria. The quality of the clinical studies was assessed using the revised Cochrane risk-of-bias tool for randomized trials and the CONSORT-AI extension guidance. Data extraction and synthesis were conducted after the quality assessment.

Results:

A total of 68 studies were selected. The applications of LLMs in orthopaedics spanned clinical practice, education, research, and management. Among these studies, 69.1% focused on clinical practice, 17.6% pertained to orthopaedic education, 11.8% related to scientific research, and 1.5% concerned management. Only 8 clinical studies recruited patients, including only one high-quality randomized controlled trial. ChatGPT was the most commonly mentioned LLM. There was considerable heterogeneity in the definition, measurement, and evaluation of LLM performance across studies. For diagnostic tasks alone, accuracy ranged from 55% to 93%. In disease classification tasks, ChatGPT 4's accuracy ranged from 2% to 100%. For orthopaedic examination questions, scores ranged from 45% to 73.6%, owing to differences in models and test selection.

Conclusions:

LLMs cannot replace orthopaedic professionals in completing various tasks in the short term. However, employing LLMs as copilots could be a practical approach to enhancing work efficiency at present. More high-quality clinical trials are needed to identify the optimal applications of LLMs and to advance orthopaedics toward higher efficiency and precision.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.