Accepted for/Published in: JMIR Infodemiology
Date Submitted: Aug 8, 2023
Date Accepted: Oct 24, 2024
Uncovering the Top Non-Advertising Weight Loss Websites on Google: A Data-mining Approach
ABSTRACT
Background:
Online weight loss information is commonly sought by internet users, and it may impact their health decisions and behaviors. Previous studies examined a limited number of Google search queries and relied on manual approaches to retrieve online weight loss websites.
Objective:
The goal of this study was to identify the top weight loss websites on Google and describe their characteristics.
Methods:
This study gathered 432 Google search queries collected from Google autocomplete suggestions, "People Also Ask" featured questions, and Google Trends data. A data-mining software tool was developed to retrieve the search results automatically, with English and the United States set as the default language and location, respectively. Domain classification and evaluation technologies were used to categorize the websites according to their content and determine their risk of cyberattack. In addition, the five most frequent websites in non-advertising (i.e., non-sponsored) search results were inspected for quality.
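The ranking step described above can be illustrated with a minimal sketch. It assumes the retrieved search results are already available as a list of records with a `url` field and a `sponsored` flag; these field names, and the helper names `domain_of` and `top_domains`, are hypothetical and are not taken from the study's tool.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL with a leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def top_domains(results, k=5):
    """Count how often each domain appears among non-sponsored results.

    `results` is assumed to be a list of dicts with the hypothetical keys
    'url' (str) and 'sponsored' (bool). Returns the k most frequent
    (domain, count) pairs.
    """
    counts = Counter(
        domain_of(r["url"]) for r in results if not r["sponsored"]
    )
    counts.pop("", None)  # discard records whose URL had no parsable host
    return counts.most_common(k)


# Illustrative records only; not data from the study.
sample = [
    {"url": "https://www.healthline.com/a", "sponsored": False},
    {"url": "https://healthline.com/b", "sponsored": False},
    {"url": "https://www.webmd.com/c", "sponsored": False},
    {"url": "https://ads.example.com/x", "sponsored": True},
]
top = top_domains(sample, k=2)
```

Stripping only the `www.` prefix is a simplification; grouping subdomains under a registered domain would need a public suffix list, which this sketch does not attempt.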
Results:
The top five non-advertising websites were healthline.com, webmd.com, verywellfit.com, mayoclinic.org, and womenshealthmag.com. All provided accuracy statements and author credentials. The domain categorization taxonomy yielded a total of 101 unique categories. After grouping the websites that appeared fewer than five times, the most frequent categories were "Health" (n = 104, 16.69%), "Personal Pages and Blogs" (n = 91, 14.61%), "Nutrition and Diet" (n = 48, 7.7%), and "Exercise" (n = 34, 5.46%). The risk of being a victim of a cyberattack was low.
Conclusions:
This study provided initial evidence that a data-mining software tool can help to identify the most common websites for weight loss.
Clinical Trial: N/A