JMIR Preprints #70754: Quantifying innovation in stroke: a large language model bibliometric analysis

Quantifying innovation in stroke: a large language model bibliometric analysis

Adam Marcus;
Georgina Lockwood-Taylor;
Daniel Rueckert;
Paul Bentley

ABSTRACT

Background:

Thrombolysis and mechanical thrombectomy represent the most successful stroke innovations of the last thirty years. Quantifying innovation in stroke is essential for identifying productive research lines and prioritizing funding, but healthcare lacks validated methods for measuring innovation.

Objective:

This study aims to systematically evaluate the relationship between stroke-related patents and publications, demonstrate the feasibility of using large language models (LLMs) in this process, and identify the most rapidly advancing innovations in stroke care.

Methods:

Electronic patent and research publication databases were searched for records from 1993 to 2023 matching “stroke” OR “cerebrovascular”. An LLM was trained on a manually labeled subset of patents to distinguish patents related to stroke disease from those using the word ‘stroke’ in other senses, and was assessed using cross-validation. The LLM filtered out irrelevant results, and the resulting patent codes were grouped into innovation clusters. Cluster-specific growth curves were plotted to analyze the rates and characteristics of growth. Adoption percentages for each innovation cluster were estimated by fitting a sigmoid curve to the patent and publication data.
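The sigmoid-fitting step can be illustrated with a minimal sketch. The abstract does not state the functional form, the fitting procedure, or how an adoption percentage is derived, so everything here is an assumption: a three-parameter logistic curve, a simple grid search over the ceiling, and adoption defined as the latest cumulative count divided by the fitted ceiling. The function names and the synthetic series are hypothetical and are not study data.

```python
import math


def logistic(t, ceiling, rate, midpoint):
    """Three-parameter sigmoid: expected cumulative count at time t."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_logistic(years, cumulative, n_grid=400):
    """Fit a logistic curve by scanning candidate ceilings (assumed method).

    For a fixed ceiling K, ln(y / (K - y)) is linear in t, so the rate and
    midpoint follow from ordinary least squares. The candidate K with the
    smallest squared error on the original scale is returned.
    """
    n = len(years)
    t_mean = sum(years) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    top = max(cumulative)
    best = None
    for i in range(1, n_grid + 1):
        ceiling = top * (1.0 + 0.01 * i)  # candidates from 1.01x to 5x the maximum
        z = [math.log(y / (ceiling - y)) for y in cumulative]
        z_mean = sum(z) / n
        sxz = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(years, z))
        rate = sxz / sxx
        midpoint = t_mean - z_mean / rate
        sse = sum(
            (logistic(t, ceiling, rate, midpoint) - y) ** 2
            for t, y in zip(years, cumulative)
        )
        if best is None or sse < best[0]:
            best = (sse, ceiling, rate, midpoint)
    return best[1], best[2], best[3]


def adoption_percentage(cumulative, ceiling):
    """Assumed definition: latest cumulative count as a share of the ceiling."""
    return 100.0 * cumulative[-1] / ceiling


# Synthetic, made-up series for illustration only (not data from the study).
demo_years = list(range(1993, 2024))
demo_counts = [logistic(t, 1000.0, 0.3, 2015.0) for t in demo_years]
demo_ceiling, demo_rate, demo_midpoint = fit_logistic(demo_years, demo_counts)
print(round(adoption_percentage(demo_counts, demo_ceiling), 1))
```

The grid search keeps the sketch dependency-free. A nonlinear least-squares routine would be the more usual choice on real, noisy counts, where the logit transform can also fail if any cumulative value is zero.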

Results:

The cross-validated accuracy of the LLM was 99.2%. An initial bibliometric search retrieved 237,035 patents and 486,664 research publications. After LLM filtering, 28,225 stroke-related patents remained and were grouped into seven innovation clusters. Pharmacological treatments were the top-performing cluster over the last thirty years, accounting for approximately half of all patents. Artificial intelligence (AI) methods, rehabilitation devices, and medical imaging exhibited exponential rates of patent growth, with annual normalized increases of 39.2%, 15.9%, and 5.8%, compared to 16.9%, 5.3%, and 2.2% for publications, respectively. These innovations were identified as the most rapidly advancing in stroke management.
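An annual percentage increase of the kind reported above can be estimated from yearly counts under an exponential-growth assumption. The sketch below fits a straight line to the logarithm of the counts and converts the slope to a yearly percentage. It is one plausible approach and not necessarily the authors' method: the abstract does not define how the increases were normalized, and no normalization is applied here. The function name and the example series are hypothetical.

```python
import math


def annual_growth_percent(years, counts):
    """Estimate a constant yearly growth rate via least squares on ln(counts).

    If counts follow c0 * g ** (year - year0), then ln(counts) is linear in
    year with slope ln(g), so the yearly increase is (exp(slope) - 1) * 100.
    All counts must be strictly positive.
    """
    logs = [math.log(c) for c in counts]
    n = len(years)
    t_mean = sum(years) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    sxl = sum((t - t_mean) * (l - l_mean) for t, l in zip(years, logs))
    slope = sxl / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Made-up series growing by 20% per year, for illustration only.
example_years = list(range(2010, 2020))
example_counts = [50.0 * 1.2 ** (t - 2010) for t in example_years]
print(round(annual_growth_percent(example_years, example_counts), 2))
```

Running the same function on a cluster's patent counts and on its publication counts would give the two rates being compared.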

Conclusions:

This study shows how LLMs, applied to publicly available patent and publication data, can quantify innovation in stroke and identify the areas of greatest traction.

Citation

Please cite as:

Marcus A, Lockwood-Taylor G, Rueckert D, Bentley P

Quantifying Innovation in Stroke: Large Language Model Bibliometric Analysis

J Med Internet Res 2026;28:e70754

DOI: 10.2196/70754

PMID: 41558024

PMCID: 12869152

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 1, 2025

Open Peer Review Period: Jan 1, 2025 - Feb 26, 2025

Date Accepted: Dec 23, 2025
