Currently submitted to: JMIR Medical Education
Date Submitted: Jan 16, 2026
Open Peer Review Period: Jan 16, 2026 - Mar 13, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Large Language Models and Examination Performance in Healthcare Education: A Bibliometric Analysis
ABSTRACT
Background:
Large language models (LLMs) are increasingly used and evaluated in health professions education, including studies assessing model performance on healthcare examination questions. The rapid growth and heterogeneity of this literature make it difficult to track research concentration, collaboration patterns, and emerging themes.
Objective:
To map publication trends, key contributors, collaboration networks, and thematic hotspots in research on LLM-supported exam solving in healthcare education.
Methods:
We conducted a bibliometric analysis of publications from 2023–2025. Searches were performed in PubMed, Scopus, CINAHL Ultimate (EBSCOhost), and Web of Science using structured terms for AI/LLMs (eg, ChatGPT, generative AI, large language models) combined with healthcare education and training concepts. Eligible studies addressed AI-based technologies within healthcare education or training contexts; studies focused solely on clinical practice or non-educational applications were excluded. Bibliographic metadata from PubMed (TXT) and Scopus (BIB) were merged and analyzed using bibliometrix/Biblioshiny (R) and VOSviewer to quantify productivity, collaboration (including international co-authorship), and keyword co-occurrence patterns.
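The following is a minimal sketch of this merging and analysis workflow in R with the bibliometrix package; the file names are placeholders, and the exact convert2df options depend on how the records were exported (here assumed to be a PubMed TXT export and a Scopus BibTeX export).

# Sketch of the record-merging and analysis steps (R, bibliometrix); file names are illustrative
library(bibliometrix)

# Convert each database export into a bibliometrix data frame
pubmed_df <- convert2df("pubmed_export.txt", dbsource = "pubmed", format = "pubmed")
scopus_df <- convert2df("scopus_export.bib", dbsource = "scopus", format = "bibtex")

# Merge the two sources and remove duplicate records
M <- mergeDbSources(pubmed_df, scopus_df, remove.duplicated = TRUE)

# Descriptive indicators: productivity, authors, sources, citations
results <- biblioAnalysis(M, sep = ";")
summary(results, k = 10)

# Keyword co-occurrence network (comparable maps can also be built in VOSviewer)
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")
networkPlot(NetMatrix, n = 30, Title = "Keyword co-occurrence", type = "fruchterman")

# The merged data frame M can also be explored interactively via biblioshiny()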
Results:
The dataset comprised 262 documents from 158 sources, with an annual publication growth rate of 36.58% and a mean document age of 1.83 years. A total of 1,351 authors contributed (mean 5.97 co-authors per document); internationally co-authored publications accounted for 13.36%. Most records were journal articles (253/262), followed by letters (8/262) and one conference paper. Annual output rose from 52 (2023) to 113 (2024; +117.3%), then decreased to 97 (2025; −14.2% vs 2024) while remaining above 2023 levels. JMIR Medical Education published the most articles on this topic (34/262), followed by Scientific Reports (9/262) and BMC Medical Education (7/262). Frequent keywords included “humans” (n=144), “artificial intelligence” (n=82), “generative AI” (n=30), and “large language models” (n=20); education-focused terms such as “educational measurement/methods” were also prominent (n=76).
Conclusions:
Research on LLMs and exam performance in healthcare education expanded rapidly from 2023 to 2025, with publication activity concentrated in a limited set of journals and relatively low international collaboration. Thematic patterns emphasize assessment-related outcomes and LLM/ChatGPT performance, underscoring the need for more comparable and transparent reporting (eg, prompts and model versions) and for education-centered outcomes beyond accuracy in future studies.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.