Accepted for/Published in: JMIR AI

Date Submitted: Mar 21, 2024
Date Accepted: Nov 7, 2024

The final, peer-reviewed published version of this preprint can be found here:

Gao Y, Li R, Croxford E, Caskey JR, Patterson BW, Churpek MM, Miller T, Dligach D, Afshar M

Leveraging Medical Knowledge Graphs Into Large Language Models for Diagnosis Prediction: Design and Application Study

JMIR AI 2025;4:e58670

DOI: 10.2196/58670

PMID: 39993309

PMCID: 11894347

Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction: Design and Application Study

  • Yanjun Gao
  • Ruizhe Li
  • Emma Croxford
  • John R Caskey
  • Brian W Patterson
  • Matthew M. Churpek
  • Timothy Miller
  • Dmitriy Dligach
  • Majid Afshar

ABSTRACT

Background:

Electronic Health Records (EHRs) and routine documentation practices are crucial for providing comprehensive health records, diagnoses, and treatments for patients' daily care. However, the complexity and verbosity of EHR narratives can overload healthcare providers and risk diagnostic inaccuracies.

Objective:

This study aims to improve the proficiency of large language models (LLMs) in automated diagnosis generation by augmenting them with a medical knowledge graph, with the goal of reducing diagnostic errors and preventing patient harm.

Methods:

We introduced an innovative approach that integrates a medical knowledge graph (KG) and a novel graph model, DR.KNOWs, inspired by clinical diagnostic reasoning processes. We derived the KG from the National Library of Medicine's Unified Medical Language System (UMLS), a robust repository of biomedical knowledge. This method eschews the need for pre-training, leveraging the KG as an auxiliary tool for interpreting and summarizing complex medical concepts. We evaluated the model intrinsically, on its ability to predict the correct concepts for diagnoses, and extrinsically, on whether it improves language models in a diagnosis prediction task. We also conducted a human evaluation that scored the “Reasoning” section generated by the language models for explainability.
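
To make the idea of KG paths concrete, the following is a minimal sketch of multi-hop path enumeration over a toy graph. It is not the DR.KNOWs model, which the abstract describes only as a novel graph model; it is a plain breadth-first traversal that ranks end concepts by path count. The graph contents, relation names, and the function names `expand_paths` and `candidate_diagnoses` are hypothetical and are not taken from UMLS or from the paper.

```python
from collections import deque

# Hypothetical toy graph: concept -> list of (relation, neighbor) edges.
# Concept and relation names are illustrative only, not real UMLS entries.
TOY_KG = {
    "fever": [("associated_with", "infection")],
    "cough": [("associated_with", "infection"), ("symptom_of", "pneumonia")],
    "infection": [("may_cause", "sepsis"), ("may_cause", "pneumonia")],
}


def expand_paths(kg, sources, max_hops=2):
    """Enumerate all paths of 1..max_hops edges starting at each source concept.

    A path is a list alternating concept, relation, concept, ...
    """
    paths = []
    queue = deque((s, [s]) for s in sources)
    while queue:
        node, path = queue.popleft()
        hops = len(path) // 2  # number of edges already in this path
        if hops >= max_hops:
            continue
        for relation, neighbor in kg.get(node, []):
            if neighbor in path[::2]:  # avoid revisiting a concept (no cycles)
                continue
            new_path = path + [relation, neighbor]
            paths.append(new_path)
            queue.append((neighbor, new_path))
    return paths


def candidate_diagnoses(paths):
    """Rank end concepts by how many enumerated paths terminate at them."""
    counts = {}
    for p in paths:
        counts[p[-1]] = counts.get(p[-1], 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


paths = expand_paths(TOY_KG, ["fever", "cough"], max_hops=2)
ranked = candidate_diagnoses(paths)
```

In a setup like the one the abstract describes, paths of this kind would be serialized as text and supplied to a language model alongside the clinical note; how DR.KNOWs actually scores and selects paths is not specified in the abstract.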

Results:

Our proposed knowledge graph model significantly surpassed traditional concept extractors in identifying accurate diagnosis concepts, with a concept-based F-score of 25.20 (95% CI: 23.93-26.98) compared to the extractor's 21.13 (95% CI: 19.85-22.41). In a diagnosis prediction shared task dataset, ChatGPT with predicted paths input achieved a ROUGE score of 25.43 (95% CI: 23.53-25.35), outperforming its no-path version, which scored 21.23 (95% CI: 19.58-21.72). The open-box T5 model attained a ROUGE score of 30.72, ranking third on the task's current leaderboard. Human evaluations revealed that models with DR.KNOWs predicted paths aligned more closely with human reasoning, showing a significant improvement over models without paths (P<0.01).
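
As a rough illustration of what a concept-based F-score measures, the sketch below computes set-based precision, recall, and F1 between predicted and reference concept sets. This is an assumption about the metric's general shape: the abstract does not state the exact definition (for example, how scores are averaged across notes), and the reported values appear to be on a 0-100 scale. The concept identifiers and the function name `concept_f_score` are hypothetical placeholders.

```python
def concept_f_score(predicted, gold):
    """Return (precision, recall, f1) for two collections of concept IDs.

    Duplicates are ignored: both inputs are treated as sets.
    """
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0, 0.0, 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Placeholder identifiers, not real UMLS concept IDs.
predicted = ["CUI_A", "CUI_B", "CUI_C", "CUI_D"]
gold = ["CUI_B", "CUI_D", "CUI_E"]
precision, recall, f1 = concept_f_score(predicted, gold)
```

Here two of four predicted concepts are correct and two of three reference concepts are recovered, so precision is 0.5, recall is 2/3, and F1 is 4/7.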

Conclusions:

This study underscores the potential of integrating medical KGs with LLMs to refine AI-driven diagnostic processes, highlighting the significance of external knowledge sources in creating explainable diagnostic pathways and advancing towards AI-enhanced diagnostic decision support systems.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.