JMIR Preprints #82885: ChatGPT vs UpToDate: A Cross-Sectional Analysis of Alignment Across Preclinical Medical Topics Using TF-IDF Cosine Similarity

ChatGPT vs UpToDate: A Cross-Sectional Analysis of Alignment Across Preclinical Medical Topics Using TF-IDF Cosine Similarity

Shankar S Thiru;
Nicholas E Aksu;
Matthew Chiang;
Daniel O Gallagher;
Mary Furlong;
Elizabeth R Prevou;
Akhil Jay Khanna

ABSTRACT

Background:

ChatGPT is increasingly relied upon as a study tool among medical trainees during the preclinical curricular phase, raising concern about its accuracy and reliability.

Objective:

This study aimed to compare responses from ChatGPT 4o mini with UpToDate content to assess their similarity.

Methods:

We posed a total of 150 preclinical-level questions: 30 each in biochemistry, immunology, microbiology, pharmacology, and pathology. ChatGPT was asked each question 5 times to account for stochasticity. Next, a text network analysis was performed using cosine comparisons of term frequency–inverse document frequency (TF-IDF) vectors to gauge the similarity between ChatGPT and UpToDate responses per question for each subject. A statistical reference (p = 0.05) for interpreting TF-IDF values was generated using random text samples with the same length distribution as the UpToDate responses. The TF-IDF similarity of ChatGPT responses to the overall subject category was also assessed.
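As a rough illustration of the similarity measure described in the Methods, the sketch below computes TF-IDF vectors and their cosine similarity using only the Python standard library. The abstract does not specify the tokenization, the IDF weighting, or the corpus over which document frequencies were computed, so the choices here (lowercase alphanumeric tokens, a smoothed IDF, and a corpus made up of one reference text plus the repeated responses) are assumptions. The example texts and all function names are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on non-alphanumeric characters (assumed scheme)."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document.

    Uses a smoothed IDF, ln((1 + n) / (1 + df)) + 1, which is an assumption;
    the study may have used a different weighting.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical texts standing in for one reference answer and repeated model answers.
reference = "Beta-lactam antibiotics inhibit bacterial cell wall synthesis."
responses = [
    "Beta-lactams block synthesis of the bacterial cell wall.",
    "These antibiotics inhibit cell wall synthesis in bacteria.",
    "Penicillins bind penicillin-binding proteins and weaken the cell wall.",
]

vectors = tfidf_vectors([reference] + responses)
similarities = [cosine(vectors[0], v) for v in vectors[1:]]
mean_similarity = sum(similarities) / len(similarities)
print([round(s, 3) for s in similarities], round(mean_similarity, 3))
```

Averaging over the repeated responses mirrors the idea of asking each question several times, although how the authors aggregated the 5 runs per question is not stated here.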

Results:

ChatGPT responses were most similar to UpToDate for pharmacology questions (TF-IDF 0.338±0.134). ChatGPT's response similarity to UpToDate for the remaining subjects was 0.321±0.142 for pathology, 0.296±0.120 for biochemistry, 0.297±0.108 for microbiology, and 0.275±0.102 for immunology. Reference TF-IDF scores for randomly generated text were 0.262, 0.279, 0.243, 0.267, and 0.281 for biochemistry, immunology, microbiology, pharmacology, and pathology, respectively.
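To make the comparison with the reference scores easier to follow, the reported mean similarity for each subject can be set beside its random-text reference. The short sketch below only restates the figures above; the variable names are illustrative, and it ignores the reported spread around each mean, so it is not a substitute for the authors' statistical analysis.

```python
# Reported mean TF-IDF similarity (ChatGPT vs UpToDate) per subject, from the Results.
mean_similarity = {
    "pharmacology": 0.338,
    "pathology": 0.321,
    "biochemistry": 0.296,
    "microbiology": 0.297,
    "immunology": 0.275,
}

# Reported reference scores for randomly generated text, from the Results.
random_reference = {
    "biochemistry": 0.262,
    "immunology": 0.279,
    "microbiology": 0.243,
    "pharmacology": 0.267,
    "pathology": 0.281,
}

# Margin of the mean over the reference; a positive value means the mean exceeds it.
margins = {
    subject: round(mean_similarity[subject] - random_reference[subject], 3)
    for subject in mean_similarity
}

for subject, margin in sorted(margins.items(), key=lambda item: item[1], reverse=True):
    print(f"{subject:<13} mean={mean_similarity[subject]:.3f} "
          f"reference={random_reference[subject]:.3f} margin={margin:+.3f}")
```

With these figures, the mean exceeds the reference for pharmacology, microbiology, pathology, and biochemistry, and sits marginally below it (by 0.004) for immunology, which is in line with the conclusion that utility may vary by subject.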

Conclusions:

The majority of ChatGPT responses are similar to UpToDate responses for preclinical questions across the subjects of biochemistry, immunology, microbiology, pharmacology, and pathology. Thus, ChatGPT may have a role in medical training during the preclinical curricular phase with the caveat that its utility may vary based on subject.

Citation

Please cite as:

Thiru SS, Aksu NE, Chiang M, Gallagher DO, Furlong M, Prevou ER, Khanna AJ

ChatGPT versus UpToDate in Preclinical Medical Education: Cross-Sectional Analysis Using Term Frequency–Inverse Document Frequency Cosine Similarity

JMIR Med Educ 2026;12:e82885

DOI: 10.2196/82885

PMID: 41861392

PMCID: 13004592

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Education

Date Submitted: Oct 26, 2025

Open Peer Review Period: Aug 23, 2025 - Oct 18, 2025

Date Accepted: Feb 5, 2026
