JMIR Preprints #52462: Automated Category and Trend Analysis of Scientific Articles Using Large Language Models (LLMs): An Application in Ophthalmology


Automated Category and Trend Analysis of Scientific Articles Using Large Language Models (LLMs): An Application in Ophthalmology

Hina Raja;
Asim Munawar;
Nikolaos Mylonas;
Mohammad Delsoz;
Yeganeh Madadi;
Mohammad Elahi;
Amr Hassan;
Hashem Abu Serhan;
Onur Inam;
Luis Hernandez;
Hao Chen;
Sang Tran;
Wuqas Munir;
Alaa Abd-Alrazaq;
Siamak Yousefi.

ABSTRACT

Background:

Objective:

In this paper, we present an automated method for classifying scientific articles using large language models (LLMs). The primary focus is ophthalmology, but the model can be extended to other fields.

Methods:

We developed a model based on natural language processing (NLP) techniques, including advanced LLMs, to process and analyze the textual content of scientific papers. Specifically, we employed zero-shot learning (ZSL) LLMs and compared them against Bidirectional and Auto-Regressive Transformers (BART) and its variants, as well as Bidirectional Encoder Representations from Transformers (BERT) and its variants, such as DistilBERT, SciBERT, PubMedBERT, and BioBERT.
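As a rough illustration of zero-shot classification (not the authors' implementation), the sketch below assigns an article text to one of several candidate categories without task-specific training. Everything named here is an assumption: the three labels are hypothetical (the abstract does not list the 15 categories), the `facebook/bart-large-mnli` checkpoint is one publicly available BART variant chosen for illustration, and `keyword_scorer` is a trivial offline stand-in so the example runs without downloading a model.

```python
from typing import Callable, Dict, Sequence

# A scorer maps (text, labels) to one score per label.
Scorer = Callable[[str, Sequence[str]], Dict[str, float]]

# Hypothetical labels for illustration only; the abstract does not
# list the 15 RenD categories.
CANDIDATE_LABELS = ["glaucoma", "cataract", "diabetic retinopathy"]


def build_bart_scorer(model_name: str = "facebook/bart-large-mnli") -> Scorer:
    """Wrap a Hugging Face zero-shot pipeline as a Scorer.

    Assumption: requires the `transformers` package and downloads the
    checkpoint on first use. Imported lazily so the rest of the sketch
    runs without it.
    """
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)

    def score(text: str, labels: Sequence[str]) -> Dict[str, float]:
        result = classifier(text, candidate_labels=list(labels))
        # The pipeline returns labels sorted by score; re-key by label.
        return dict(zip(result["labels"], result["scores"]))

    return score


def keyword_scorer(text: str, labels: Sequence[str]) -> Dict[str, float]:
    """Offline stand-in: score 1.0 if the label occurs in the text, else 0.0."""
    lowered = text.lower()
    return {label: float(label.lower() in lowered) for label in labels}


def classify(text: str, labels: Sequence[str], scorer: Scorer) -> str:
    """Return the highest-scoring label (first label wins ties)."""
    scores = scorer(text, labels)
    return max(labels, key=lambda label: scores[label])
```

Passing `build_bart_scorer()` instead of `keyword_scorer` swaps in the pretrained model while leaving `classify` unchanged, which is also how other checkpoints could be compared under the same label set.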

Results:

The classification results demonstrate the effectiveness of LLMs in categorizing large numbers of ophthalmology papers without human intervention. To evaluate the LLMs, we compiled a dataset (RenD) of 1000 ocular disease-related articles, which were annotated by a panel of six specialists into 15 distinct categories. The model achieved a mean accuracy of 0.86 and a mean F1 score of 0.85 on the RenD dataset.
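For readers unfamiliar with the reported metrics, the following minimal sketch computes accuracy and a macro-averaged F1 score from gold and predicted labels. It assumes that "mean F1" refers to an unweighted average of per-category F1 scores; the abstract does not specify the averaging scheme, and the function names are illustrative.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of articles whose predicted category matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-category F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Macro averaging weights all categories equally, which matters when some of the categories contain far fewer articles than others.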

Conclusions:

The proposed framework achieves notable improvements in both accuracy and efficiency. Its application to ophthalmology demonstrates its potential for knowledge organization and retrieval in other domains as well. We also performed a trend analysis that enables researchers and clinicians to easily categorize and retrieve relevant papers, saving time and effort in literature review and information gathering, and to identify emerging scientific trends within different disciplines. Moreover, the extensibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.
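One simple way to picture the trend analysis, assuming each classified article carries a publication year, is to count articles per category per year. The sketch below is an illustration under that assumption rather than the procedure used in the paper; the record layout and function name are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def category_trends(records: Iterable[Tuple[int, str]]) -> Dict[str, Dict[int, int]]:
    """Count articles per category per year.

    `records` is an iterable of (publication_year, predicted_category)
    pairs, a hypothetical layout chosen for this example.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for year, category in records:
        counts[category][year] += 1
    # Sort years so each category reads as a chronological series.
    return {cat: dict(sorted(by_year.items())) for cat, by_year in counts.items()}
```

A rising count for a category across consecutive years would then point to a growing research topic, subject to the accuracy of the underlying classification.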

Citation

Please cite as:

Raja H, Munawar A, Mylonas N, Delsoz M, Madadi Y, Elahi M, Hassan A, Abu Serhan H, Inam O, Hernandez L, Chen H, Tran S, Munir W, Abd-Alrazaq A, Yousefi S

Automated Category and Trend Analysis of Scientific Articles on Ophthalmology Using Large Language Models: Development and Usability Study

JMIR Form Res 2024;8:e52462

DOI: 10.2196/52462

PMID: 38517457

PMCID: 10998173


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.


Accepted for/Published in: JMIR Formative Research

Date Submitted: Sep 4, 2023

Open Peer Review Period: Sep 4, 2023 - Oct 30, 2023

Date Accepted: Feb 2, 2024
