JMIR Preprints #53335: Guideline-incorporated Large Language Model-Driven Evaluation of Medical Records Using MedCheckLLM


Guideline-incorporated Large Language Model-Driven Evaluation of Medical Records Using MedCheckLLM

Varun Venkataramani;
Marc Cicero Schubert;
Wolfgang Wick

ABSTRACT

Large language models (LLMs) have been applied across many domains and show considerable potential for processing and interpreting complex datasets in healthcare. One area yet to be thoroughly explored is the use of LLMs for the reliable and reproducible evaluation of medical documents. Effective automatic evaluation of these documents could improve healthcare, enhance patient safety, reduce the risk of cognitive and other biases, and refine the training process of LLMs. It is essential that the system's reasoning process be (a) transparent and comprehensible to human evaluators, for example in the form of a completed checklist, and (b) guided by established medical guidelines, which are proven to increase patient safety and are the gold standard for implementing clinical care, thereby improving the overall performance and applicability of AI-driven healthcare. In this study, we introduce a multi-step framework for medical record evaluation that incorporates guidelines directly into the evaluation process, a concept we term 'guideline-in-the-loop'. Our proposed algorithm, MedCheckLLM, is an LLM-driven, structured, layered reasoning mechanism designed to automate the evaluation of medical records, with particular emphasis on evaluation against evidence-based guidelines. Crucially, the guidelines are deterministically accessed by the LLM as out-of-training data. This strict separation of LLM and guidelines is expected to increase the validity and interpretability of the evaluations and to offer flexibility for updating guidelines. The primary objective of this research is to introduce the conceptual framework and assess its feasibility. This approach is expected to have significant implications for healthcare quality and for the transparent and efficient application of LLMs in clinical settings.
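The 'guideline-in-the-loop' idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the guideline store, the checklist items, the prompt wording, the `evaluate_record` function, and the keyword-based `stub_llm` stand-in are all hypothetical and chosen only to keep the example runnable. It assumes the diagnosis is already known, that guidelines are kept in a plain lookup table outside the model (so retrieval is deterministic), and that the model is asked about one checklist item at a time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical guideline store kept outside the model. A plain dictionary
# lookup is deterministic and can be updated without retraining the LLM.
GUIDELINES: Dict[str, List[str]] = {
    "migraine": [
        "Document headache frequency and duration",
        "Record current acute medication",
    ],
}

QUESTION = "Is this item documented?"


@dataclass
class ChecklistResult:
    item: str
    fulfilled: bool


def evaluate_record(
    record: str, diagnosis: str, llm: Callable[[str], str]
) -> List[ChecklistResult]:
    """Check one medical record against the stored guideline items."""
    key = diagnosis.strip().lower()
    if key not in GUIDELINES:
        raise KeyError(f"no guideline stored for {diagnosis!r}")
    results = []
    for item in GUIDELINES[key]:
        # One prompt per checklist item keeps each judgment inspectable.
        prompt = f"Record:\n{record}\n\n{QUESTION} {item}\nAnswer yes or no."
        answer = llm(prompt).strip().lower()
        results.append(ChecklistResult(item, answer.startswith("yes")))
    return results


def stub_llm(prompt: str) -> str:
    """Toy stand-in for a real model call, based on simple keyword rules."""
    record_part, item_part = prompt.lower().split(QUESTION.lower(), 1)
    if "medication" in item_part:
        return "yes" if "sumatriptan" in record_part else "no"
    if "frequency" in item_part:
        return "yes" if "per month" in record_part else "no"
    return "no"


record = "Patient reports headaches. Takes sumatriptan as needed."
results = evaluate_record(record, "Migraine", stub_llm)
for r in results:
    print(("[x] " if r.fulfilled else "[ ] ") + r.item)
```

Because the guideline text lives in the lookup table and not in the model, replacing an entry in `GUIDELINES` changes the checklist immediately, and the per-item results give a human reviewer a transparent record of what was checked.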

Citation

Please cite as:

Venkataramani V, Schubert MC, Wick W

Guideline-Incorporated Large Language Model-Driven Evaluation of Medical Records Using MedCheckLLM

JMIR Form Res 2025;9:e53335

DOI: 10.2196/53335

PMID: 40272831

PMCID: 12045122


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Formative Research

Date Submitted: Oct 3, 2023

Date Accepted: Nov 17, 2024
