JMIR Preprints #50209: Retrieval-Based Diagnostic Decision Support


Retrieval-Based Diagnostic Decision Support

Tassallah Amina Abdullahi;
Laura Mercurio;
Ritambhara Singh;
Carsten Eickhoff

ABSTRACT

Background:

Diagnostic errors pose significant health risks and contribute to patient mortality. With the growing accessibility of electronic health records, machine learning models offer a promising avenue for enhancing diagnosis quality. Current research has primarily focused on a limited set of diseases with ample training data, neglecting diagnostic scenarios with limited data availability.

Objective:

This study aims to develop an information retrieval (IR)-based framework that accommodates data sparsity to facilitate broader diagnostic decision support.

Methods:

We present an IR-based diagnostic decision support framework called CliniqIR. It uses clinical text records, the Unified Medical Language System (UMLS) Metathesaurus, and 33 million PubMed abstracts to classify a broad spectrum of diagnoses independent of training data availability. We compare CliniqIR's performance to that of pre-trained clinical transformer models (such as ClinicalBERT) in supervised and zero-shot settings. Subsequently, we combine the strengths of supervised fine-tuned ClinicalBERT and CliniqIR to build an ensemble framework that delivers state-of-the-art diagnostic predictions.
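The abstract does not say how the ensemble merges the two models' outputs. As an illustration only, the sketch below uses reciprocal rank fusion, one common way to combine ranked lists; it should not be read as the authors' method. The function name, the constant k = 60, and the toy diagnosis lists are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of diagnosis labels into one ranking.

    Each label receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. The constant k is an assumed smoothing value.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, label in enumerate(ranking, start=1):
            scores[label] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


# Hypothetical ranked outputs for one clinical note (not data from the study).
supervised_ranking = ["pneumonia", "asthma", "sarcoidosis"]
retrieval_ranking = ["sarcoidosis", "pneumonia", "tuberculosis"]

fused = reciprocal_rank_fusion([supervised_ranking, retrieval_ranking])
print(fused)
```

In this toy case, a diagnosis that both lists rank highly rises to the top, while a diagnosis that only the retrieval list proposes still stays in the fused ranking. That is the kind of complementarity the ensemble is described as exploiting.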

Results:

On DC3, a rare disease dataset with no training data, CliniqIR returns the correct diagnosis among its top 3 predictions on average. On the MIMIC-III dataset, CliniqIR outperforms ClinicalBERT in predicting diagnoses with fewer than five training samples by an average of 9% in mean reciprocal rank (MRR). In a zero-shot setting, where no task-specific training was conducted, CliniqIR also outperforms the pre-trained transformer models by 10% in MRR. Furthermore, our ensemble framework surpasses each of its constituent models by at least 8% in MRR.
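For readers unfamiliar with the metric, MRR averages the reciprocal of the rank at which the correct diagnosis first appears in each case's ranked prediction list. The sketch below follows that standard definition; the function names and the toy rankings are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence


def reciprocal_rank(ranked_labels: Sequence[str], true_label: str) -> float:
    """Return 1 / rank of the true label (rank starts at 1), or 0.0 if it is absent."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label == true_label:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], true_labels: Sequence[str]
) -> float:
    """Average the reciprocal rank over all cases."""
    if len(rankings) != len(true_labels):
        raise ValueError("rankings and true_labels must have the same length")
    if not rankings:
        raise ValueError("at least one case is required")
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, true_labels))
    return total / len(rankings)


# Three hypothetical cases: the true diagnosis "a" is ranked 1st, 2nd, and 3rd.
toy_rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
toy_truths = ["a", "a", "a"]

mrr = mean_reciprocal_rank(toy_rankings, toy_truths)
print(round(mrr, 4))  # (1 + 1/2 + 1/3) / 3
```

A case whose correct diagnosis never appears in the ranked list contributes 0, so MRR penalizes misses as well as low placements.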

Conclusions:

Our experiments highlight the importance of IR in leveraging unstructured knowledge resources to identify infrequently encountered diagnoses. In addition, our ensemble framework benefits from combining the complementary strengths of the supervised and retrieval-based models to diagnose a broad spectrum of diseases.

Citation

Please cite as:

Abdullahi TA, Mercurio L, Singh R, Eickhoff C

Retrieval-Based Diagnostic Decision Support: Mixed Methods Study

JMIR Med Inform 2024;12:e50209

DOI: 10.2196/50209

PMID: 38896468

PMCID: 11222760


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 25, 2023

Date Accepted: Apr 17, 2024
