Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 2, 2021
Open Peer Review Period: Mar 2, 2021 - Apr 27, 2021
Date Accepted: Mar 15, 2022

The final, peer-reviewed published version of this preprint can be found here:

Sauvayre R

Types of Errors Hiding in Google Scholar Data

J Med Internet Res 2022;24(5):e28354

DOI: 10.2196/28354

PMID: 35622395

PMCID: 9187964

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

What types of errors are hiding in Google Scholar data? Methodological concerns

  • Romy Sauvayre

ABSTRACT

Background:

Google Scholar (GS) is a free tool that researchers may use to analyze citations, find appropriate literature, or evaluate the quality of an author or a candidate for tenure, promotion, a faculty position, funding, or research grants. GS has become a major bibliographic and citation database. According to the literature, databases such as PubMed, PsycINFO, Scopus, or Web of Science can be used in place of GS because they are more reliable.

Objective:

The aim of this study is to examine the accuracy of citation data collected from GS and provide a comprehensive description of the errors and miscounts identified.

Methods:

A total of 281 documents that cited two specific works were retrieved using the Publish or Perish software and examined. The two cited works studied the false-positive issue inherent in the analysis of neuroimaging data.

Results:

The results reveal an unprecedented error rate: 99.3% of the references examined contain at least one error.

Conclusions:

Google Scholar data not only fail to be accurate but also potentially expose those researchers who would use these data without verification to substantial biases in their analyses and results.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.