Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jun 13, 2024
Date Accepted: Dec 18, 2024
Large Language Models based Critical Care Big Data Deployment and Extraction: Descriptive Analysis.
ABSTRACT
Background:
Publicly accessible critical care-related databases contain enormous amounts of clinical data, but using them often requires advanced programming skills. The growing complexity of large databases and unstructured data presents challenges for clinicians, who would need programming or data analysis expertise to use these systems directly.
Objective:
This study aims to simplify the deployment of critical care-related databases and the extraction of data from them using large language models.
Methods:
The platform was developed in two steps. First, we automated database deployment using Docker container technology and incorporated the web-based analytics interfaces Metabase and Superset. Second, we developed the Intensive Care Unit-Generative Pre-trained Transformer (ICU-GPT), a large language model fine-tuned on intensive care unit (ICU) data and integrated with LangChain and Microsoft AutoGen.
Results:
The automated deployment platform was designed with user-friendliness in mind, enabling clinicians to deploy one or multiple databases in local, cloud, or remote environments without manual setup. After overcoming GPT’s token limit and adding support for multi-schema data, ICU-GPT could generate Structured Query Language (SQL) queries and extract insights from ICU datasets based on user requests. A front-end user interface was developed so that clinicians can generate SQL without writing code in the web-based client.
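The abstract does not state how the token limit was overcome. One common approach, shown here purely as an illustrative sketch and not as the authors' method, is to include in the prompt only those table descriptions that are relevant to the request and that fit a token budget. The catalog contents, the function names, and the 4-characters-per-token estimate below are all assumptions made for this example.

```python
import re

# Hypothetical multi-schema catalog ("schema.table" -> columns); not taken from the paper.
CATALOG = {
    "icu_a.stays": ["stay_id", "subject_id", "intime", "outtime", "los"],
    "icu_a.labs": ["subject_id", "itemid", "charttime", "valuenum"],
    "icu_b.patients": ["subject_id", "gender", "age"],
}


def words(text):
    """Lowercase alphanumeric words; underscores and dots act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def estimate_tokens(text):
    """Crude token estimate (assumed ~4 characters per token)."""
    return max(1, len(text) // 4)


def describe(table, columns):
    """Compact one-line description of a table for the prompt."""
    return f"{table}({', '.join(columns)})"


def select_schemas(question, catalog, budget):
    """Pick tables sharing words with the question until the budget is spent."""
    asked = words(question)
    scored = []
    for table, columns in catalog.items():
        overlap = len(asked & words(table + " " + " ".join(columns)))
        scored.append((overlap, table))
    # Highest overlap first; ties broken by table name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))

    chosen, used = [], 0
    for overlap, table in scored:
        if overlap == 0:
            continue  # no shared words, so the table is treated as irrelevant
        line = describe(table, catalog[table])
        cost = estimate_tokens(line)
        if used + cost > budget:
            continue  # skip tables that would exceed the token budget
        chosen.append(line)
        used += cost
    return chosen


def build_prompt(question, catalog, budget=200):
    """Assemble a text-to-SQL prompt from the selected table descriptions."""
    tables = "\n".join(select_schemas(question, catalog, budget))
    return f"Tables:\n{tables}\nQuestion: {question}\nSQL:"


if __name__ == "__main__":
    print(build_prompt("average length of stay by gender", CATALOG))
```

Keyword overlap is the simplest possible relevance measure; an embedding-based retriever could be substituted for `select_schemas` without changing the rest of the sketch.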
Conclusions:
By using our automated deployment platform and the ICU-GPT model, clinicians can visualize, extract, and organize data from critical care-related databases more efficiently and flexibly than with manual methods. Our research could decrease the time and effort spent on complex bioinformatics methods and advance clinical research.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.