Accepted for/Published in: JMIR Medical Education

Date Submitted: Mar 10, 2025
Date Accepted: Mar 12, 2025

The final, peer-reviewed published version of this preprint can be found here:

Temsah MH, Al-Eyadhy A, Jamal A, Alhasan K, Malki KH

Authors’ Reply: Citation Accuracy Challenges Posed by Large Language Models

JMIR Med Educ 2025;11:e73698

DOI: 10.2196/73698

PMID: 40173373

PMCID: 12037898

Authors’ Reply: Comment on: Perceptions and Earliest Experiences of Medical Students and Faculty With ChatGPT in Medical Education: Qualitative Study

  • Mohamad-Hani Temsah
  • Ayman Al-Eyadhy
  • Amr Jamal
  • Khalid Alhasan
  • Khalid H Malki

ABSTRACT

Large language models (LLMs) have demonstrated significant potential in academic research but face challenges in generating accurate citations. The issue of hallucinated references—well-formatted but fictitious citations—arises due to LLMs' limited access to subscription-based databases and their reliance on probabilistic text generation. This letter discusses two key approaches to mitigating these issues. First, retrieval-augmented generation (RAG) combined with Hallucination Aware Tuning (HAT) improves citation integrity by integrating external databases and employing hallucination detection models. However, even RAG-HAT systems may still misinterpret source content. Second, we propose the development of “Reference-Accurate” Academic LLMs by major global publishers, which would be trained exclusively on rigorously verified academic literature, ensuring that all citations generated are authentic and traceable. We recommend a dual approach integrating RAG-HAT with publisher-backed academic LLMs, along with human oversight, to enhance AI-assisted scholarly communication. Future research should evaluate the accuracy and reliability of these methods to promote responsible AI use in academia.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.