Accepted for/Published in: JMIR AI

Date Submitted: Nov 10, 2024
Date Accepted: Jun 28, 2025

The final, peer-reviewed published version of this preprint can be found here:

Identification and Categorization of the Top 100 Articles and the Future of Large Language Models: Thematic Analysis Using Bibliometric Analysis

Bernstein E, Ramsamooj A, Millar KL, Lum ZC

Identification and Categorization of the Top 100 Articles and the Future of Large Language Models: Thematic Analysis Using Bibliometric Analysis

JMIR AI 2025;4:e68603

DOI: 10.2196/68603

PMID: 40864888

PMCID: 12384689

After One Year, Where are Large Language Models Headed: A Thematic Analysis using Bibliometric Methodology

  • Ethan Bernstein
  • Anya Ramsamooj
  • Kelsey Leann Millar
  • Zachary C Lum

ABSTRACT

Background:

Since the release of ChatGPT and other large language models (LLMs), there has been a significant increase in academic publications exploring their capabilities and implications across various fields, such as Medicine, Education, and Technology.

Objective:

This study aims to identify the most influential academic works on LLMs published in the past year and to categorize their research types and thematic focuses within different professional fields. The study also evaluates the ability of AI tools, such as ChatGPT, to accurately classify academic research.

Methods:

We conducted a bibliometric analysis using Clarivate’s Web of Science (WOS) to extract the top 100 most cited articles on LLMs. Articles were manually categorized by field, journal, author, and research type. ChatGPT-4 was used to generate categorizations for the same articles, and its performance was compared to human classifications. Statistical analyses were performed to determine the prevalence of research fields and to evaluate the accuracy of AI-generated classifications.
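As a rough illustration of the two calculations described above (field prevalence and human-versus-AI concordance), the following minimal sketch may help. It is not the authors' code: the function names and the five toy labels are assumptions made for this example, and it computes simple percent agreement, which may differ from the statistic the study actually used.

```python
from collections import Counter


def field_prevalence(labels):
    """Return each label's share of all articles, as a percentage."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * n / total for label, n in counts.items()}


def percent_agreement(human, ai):
    """Return the percentage of articles given the same label by both raters."""
    if len(human) != len(ai):
        raise ValueError("both raters must label the same number of articles")
    if not human:
        raise ValueError("label lists must not be empty")
    matches = sum(h == a for h, a in zip(human, ai))
    return 100 * matches / len(human)


# Toy labels for illustration only; these are NOT the study's data.
human_fields = ["Medicine", "Medicine", "Education", "Technology", "Education"]
ai_fields = ["Medicine", "Education", "Education", "Technology", "Education"]

print(field_prevalence(human_fields))
print(percent_agreement(human_fields, ai_fields))
```

Percent agreement is easy to read but does not correct for chance agreement, so a chance-corrected statistic could give a more conservative picture.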

Results:

Medicine emerged as the predominant field among the top-cited articles, accounting for 43%, followed by Education (26%) and Technology (15%). Medical literature primarily focused on clinical applications of LLMs, limitations of AI in healthcare, and the role of AI in medical education. In Education, research was centered around ethical concerns and potential applications of AI for teaching and learning. ChatGPT demonstrated high concordance with human reviewers, achieving an agreement rating of 86% for research types and 92% for fields of study.

Conclusions:

While LLMs like ChatGPT exhibit considerable potential in aiding research categorization, human oversight remains essential to address issues such as hallucinations, outdated information, and biases in AI-generated outputs. This study highlights the transformative potential of LLMs across multiple sectors and emphasizes the importance of continuous ethical evaluation and iterative improvement of AI systems to maximize their benefits while minimizing risks.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.