Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 18, 2024
Date Accepted: Feb 18, 2025

The final, peer-reviewed published version of this preprint can be found here:

Large Language Model–Driven Knowledge Graph Construction in Sepsis Care Using Multicenter Clinical Databases: Development and Usability Study

Shen B, Yang H, li j, Zhang C, Sierra AP

J Med Internet Res 2025;27:e65537

DOI: 10.2196/65537

PMID: 40146985

PMCID: 11986385

LLM-Driven Knowledge Graph Construction in Sepsis Care: Framework Development with Multicenter Clinical Databases

  • Bairong Shen
  • Hao Yang
  • jiaxi li
  • Chi Zhang
  • Alejandro Pazos Sierra

ABSTRACT

Background:

Sepsis is a complex, life-threatening condition that presents significant challenges due to its heterogeneity and the vast, unstructured data associated with it. Traditional methods of knowledge graph construction struggle to manage this complexity effectively.

Objective:

This study aims to harness the capabilities of large language models (LLMs), in conjunction with extensive real-world data, to develop a detailed and methodical knowledge graph focused on sepsis. The goal is to enhance our comprehension of sepsis and to furnish actionable insights for its clinical management. Additionally, we established a multicenter sepsis database (MSD) to enrich our analysis and findings.

Methods:

Our methodology involved collecting clinical guidelines, public databases, and substantial real-world data pertinent to sepsis. Using GPT-4.0, we carried out entity recognition and relationship extraction tasks through innovative prompt engineering techniques. This approach facilitated the construction of a nuanced sepsis knowledge graph.
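
As a rough illustration of what prompt-based entity recognition and relationship extraction can look like, the sketch below builds a schema-constrained prompt and validates a model reply. It is a minimal sketch under stated assumptions, not the authors' prompts or pipeline: the entity and relationship labels, the JSON reply format, and the function names `build_prompt` and `parse_triples` are all hypothetical, and no model call is made (the reply is passed in as a string).

```python
import json

# Hypothetical labels for illustration only; the abstract does not list
# the study's nine entity types or eight relationship types.
ENTITY_TYPES = ["Disease", "Drug", "Symptom"]
RELATION_TYPES = ["treats", "has_symptom"]


def build_prompt(text: str) -> str:
    """Assemble an extraction prompt that restricts the model to a fixed schema."""
    return (
        "Extract entities and relationships from the clinical text below.\n"
        f"Allowed entity types: {', '.join(ENTITY_TYPES)}\n"
        f"Allowed relationship types: {', '.join(RELATION_TYPES)}\n"
        'Answer with JSON only: {"triples": [{"head": "...", "head_type": "...", '
        '"relation": "...", "tail": "...", "tail_type": "..."}]}\n\n'
        f"Text: {text}"
    )


def parse_triples(raw: str) -> list[tuple[str, str, str]]:
    """Parse a model reply and keep only triples that respect the schema."""
    try:
        items = json.loads(raw).get("triples", [])
    except (json.JSONDecodeError, AttributeError):
        return []  # reply was not a JSON object
    if not isinstance(items, list):
        return []
    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        labels_ok = (
            item.get("relation") in RELATION_TYPES
            and item.get("head_type") in ENTITY_TYPES
            and item.get("tail_type") in ENTITY_TYPES
        )
        if labels_ok and item.get("head") and item.get("tail"):
            triples.append((item["head"], item["relation"], item["tail"]))
    return triples
```

Validating the reply against the allowed labels is one way to keep free-form model output from introducing entity or relationship types outside the predefined schema.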

Results:

We established a sepsis database that includes three centers and encompasses over 10,000 individuals. Importantly, we identified nine entity concepts and types and defined eight semantic relationships, successfully integrating the gathered knowledge through the Unified Medical Language System (UMLS) entity linker. As a result, we compiled a comprehensive sepsis knowledge graph, comprising 1,894 nodes and 2,021 distinct relationships.
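
To make the node and relationship counts concrete, the sketch below shows one way extracted triples could be merged into a graph after entity linking, so that synonymous mentions collapse into one node and repeated statements into one relationship. This is an assumption-laden illustration, not the study's implementation: the alias table stands in for a real UMLS entity linker, the concept IDs are placeholders rather than real UMLS identifiers, and the relationship name `treats` is hypothetical.

```python
# Placeholder concept IDs (not real UMLS identifiers); in practice an
# entity linker would supply the mapping from mentions to concepts.
ALIAS_TO_CONCEPT = {
    "norepinephrine": "CONCEPT_NOREPINEPHRINE",
    "noradrenaline": "CONCEPT_NOREPINEPHRINE",
    "septic shock": "CONCEPT_SEPTIC_SHOCK",
}


def link(mention: str) -> str:
    """Map a surface mention to a concept ID, falling back to the normalized text."""
    key = mention.strip().lower()
    return ALIAS_TO_CONCEPT.get(key, key)


def build_graph(triples):
    """Return (nodes, edges) as sets, so duplicates are merged automatically."""
    nodes = set()
    edges = set()
    for head, relation, tail in triples:
        h, t = link(head), link(tail)
        nodes.update((h, t))
        edges.add((h, relation, t))
    return nodes, edges


# Two mentions of the same fact under different drug names.
example = [
    ("Norepinephrine", "treats", "septic shock"),
    ("noradrenaline", "treats", "Septic Shock"),
]
nodes, edges = build_graph(example)
print(len(nodes), len(edges))  # prints "2 1"
```

Counting nodes and relationships after linking, as here, is what makes a figure such as "distinct relationships" meaningful: the same fact extracted from several sources is counted once.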

Conclusions:

This study represents a groundbreaking effort in employing prompt engineering with GPT-4.0 to establish a database and knowledge graph, thereby facilitating a systematic and in-depth understanding of sepsis. The inventive application of prompt engineering opens new avenues for the advancement of knowledge graphs, providing significant technical support to enhance both the efficiency and quality of knowledge graph construction.

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.