Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 13, 2024
Date Accepted: Dec 18, 2024

The final, peer-reviewed published version of this preprint can be found here:

Large Language Model–Based Critical Care Big Data Deployment and Extraction: Descriptive Analysis

Yang Z, Xu SS, Liu X, Chen Y, Wang S, Miao MY, Hou M, Liu S, Zhou YM, Xu N, Zhou JX, Zhang L

JMIR Med Inform 2025;13:e63216

DOI: 10.2196/63216

PMID: 40079079

PMCID: 11922493

Large Language Models based Critical Care Big Data Deployment and Extraction: Descriptive Analysis.

  • Zhongbao Yang
  • Shan-Shan Xu
  • Xiaozhu Liu
  • Yuqing Chen
  • Shuya Wang
  • Ming-Yue Miao
  • Mengxue Hou
  • Shuai Liu
  • Yi-Min Zhou
  • Ningyuan Xu
  • Jian-Xin Zhou
  • Linlin Zhang

ABSTRACT

Background:

Publicly accessible critical care-related databases contain enormous amounts of clinical data, but using them often requires advanced programming skills. The growing complexity of large databases and unstructured data adds to the challenge for clinicians, who need programming or data analysis expertise to use these systems directly.

Objective:

This study aims to simplify the deployment of critical care-related databases, and data extraction from them, using large language models.

Methods:

The platform was developed in two steps. First, we enabled automated database deployment using Docker container technology, incorporating the web-based analytics interfaces Metabase and Superset. Second, we developed the Intensive Care Unit-Generative Pre-trained Transformer (ICU-GPT), a large language model fine-tuned on intensive care unit (ICU) data that integrates LangChain and Microsoft AutoGen.

Results:

The automated deployment platform was designed with user-friendliness in mind, enabling clinicians to deploy one or multiple databases in local, cloud, or remote environments without manual setup. After overcoming GPT's token limit and adding support for multischema data, ICU-GPT could generate Structured Query Language (SQL) queries and extract insights from ICU datasets based on the user's request. A front-end user interface was developed so that clinicians can generate SQL without writing code in the web-based client.
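
The abstract does not say how ICU-GPT works around the token limit. One common approach, shown here only as an assumption and not as the authors' method, is to send the model just the tables that look relevant to the request, described compactly and schema-qualified, within a fixed token budget. In this minimal Python sketch, every name (`Table`, `select_tables`, `build_prompt`, the example tables, and the 4-characters-per-token estimate) is hypothetical, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    schema: str      # schema name, so tables from several schemas can coexist
    name: str
    columns: tuple   # column names only, to keep the description short

    def ddl(self) -> str:
        # Compact, schema-qualified one-line description of the table.
        return f"{self.schema}.{self.name}({', '.join(self.columns)})"


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def relevance(table: Table, question: str) -> int:
    # Count question words (3+ letters) found in the table or column names.
    words = {w.strip("?,.").lower() for w in question.split()}
    words = {w for w in words if len(w) >= 3}
    names = {table.name.lower(), *(c.lower() for c in table.columns)}
    return sum(1 for w in words if any(w in n for n in names))


def select_tables(tables, question, budget):
    # Rank tables by relevance, then keep those that fit the token budget.
    scored = [(relevance(t, question), t) for t in tables]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    chosen, used = [], 0
    for score, table in scored:
        cost = estimate_tokens(table.ddl())
        if score == 0 or used + cost > budget:
            continue
        chosen.append(table)
        used += cost
    return chosen


def build_prompt(tables, question, budget=200):
    # Assemble the text that would be sent to the language model.
    lines = [t.ddl() for t in select_tables(tables, question, budget)]
    return (
        "Write one SQL query for the question below.\n"
        "Tables:\n" + "\n".join(lines) + f"\nQuestion: {question}\nSQL:"
    )


# Illustrative catalog spanning two schemas (table names are made up).
catalog = [
    Table("icu", "icustays", ("stay_id", "subject_id", "intime", "outtime", "los")),
    Table("icu", "chartevents", ("stay_id", "itemid", "charttime", "valuenum")),
    Table("hosp", "labevents", ("subject_id", "itemid", "charttime", "valuenum")),
]
print(build_prompt(catalog, "What is the average los for icustays?"))
```

Keyword overlap is the simplest possible relevance score. A real system would more likely use embeddings or a retrieval step, but the budget logic would stay the same.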

Conclusions:

With the automated deployment platform and the ICU-GPT model, clinicians can visualize, extract, and arrange data from critical care-related databases more efficiently and flexibly than with manual methods. This work could reduce the time and effort spent on complex bioinformatics methods and advance clinical research.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.