Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 25, 2023
Date Accepted: Apr 17, 2024

The final, peer-reviewed published version of this preprint can be found here:

Retrieval-Based Diagnostic Decision Support: Mixed Methods Study

Abdullahi TA, Mercurio L, Singh R, Eickhoff C

JMIR Med Inform 2024;12:e50209

DOI: 10.2196/50209

PMID: 38896468

PMCID: 11222760

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Retrieval-Based Diagnostic Decision Support

  • Tassallah Amina Abdullahi
  • Laura Mercurio
  • Ritambhara Singh
  • Carsten Eickhoff

ABSTRACT

Background:

Diagnostic errors pose significant health risks and contribute to patient mortality. With the growing accessibility of electronic health records, machine learning models offer a promising avenue for enhancing diagnosis quality. Current research has primarily focused on a limited set of diseases with ample training data, neglecting diagnostic scenarios with limited data availability.

Objective:

This study aims to develop an information retrieval (IR)-based framework that accommodates data sparsity to facilitate broader diagnostic decision support.

Methods:

We present an IR-based diagnostic decision support framework called CliniqIR. It uses clinical text records, the Unified Medical Language System (UMLS) Metathesaurus, and 33 million PubMed abstracts to classify a broad spectrum of diagnoses independent of training data availability. We compare CliniqIR's performance to that of pre-trained clinical transformer models (such as ClinicalBERT) in supervised and zero-shot settings. Subsequently, we combine the strengths of supervised fine-tuned ClinicalBERT and CliniqIR to build an ensemble framework that delivers state-of-the-art diagnostic predictions.

Results:

On DC3, a rare disease dataset with no training data, CliniqIR returns the correct diagnosis among its top 3 predictions on average. On the MIMIC-III dataset, CliniqIR outperforms ClinicalBERT in predicting diagnoses with fewer than five training samples by an average of 9% in mean reciprocal rank (MRR). In a zero-shot setting, where no task-specific training is conducted, CliniqIR also outperforms the pre-trained transformer models by 10% in MRR. Furthermore, our ensemble framework surpasses its individual constituent models by a minimum of 8% in MRR.
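For readers unfamiliar with the metric, MRR averages the reciprocal of the rank at which the correct diagnosis first appears in each ranked prediction list. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function names and the toy diagnoses are hypothetical, and scoring a case as 0 when the correct diagnosis is absent from the list is an assumed convention.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], truth: Hashable) -> float:
    """Return 1/rank of the first correct item, or 0.0 if it is absent (assumed convention)."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == truth:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], truths: Sequence[Hashable]
) -> float:
    """Average the reciprocal rank over all cases."""
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not rankings:
        raise ValueError("at least one case is required")
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, truths))
    return total / len(truths)


# Toy example with made-up labels: correct diagnosis at rank 1, at rank 2, and missing.
example_rankings = [
    ["sepsis", "pneumonia", "asthma"],
    ["asthma", "sepsis"],
    ["gout", "lupus"],
]
example_truths = ["sepsis", "sepsis", "anemia"]
example_mrr = mean_reciprocal_rank(example_rankings, example_truths)
print(example_mrr)  # (1 + 1/2 + 0) / 3 = 0.5
```

Under this definition, an MRR near 1 means the correct diagnosis is usually ranked first, while an MRR near 0.33 corresponds to it typically appearing around the third position.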

Conclusions:

Our experiments highlight the importance of IR in leveraging unstructured knowledge resources to identify infrequently encountered diagnoses. In addition, our ensemble framework benefits from combining the complementary strengths of the supervised and retrieval-based models to diagnose a broad spectrum of diseases.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.