Accepted for/Published in: JMIR mHealth and uHealth
Date Submitted: Feb 10, 2024
Date Accepted: May 26, 2025
Grouping Digital Health Apps Based on Their Quality and User Ratings Using K-Medoids Clustering: A Cross-sectional Study
ABSTRACT
Background:
Digital health apps allow for proactive rather than reactive healthcare and have the potential to take pressure off healthcare providers. With over 350,000 digital health apps available on app stores today, these apps need to be of sufficient quality to be safe to use. Discovering a typology of digital health apps with respect to professional/clinical assurance, user experience, data privacy, and user rating may help determine the areas in which digital health apps can improve.
Objective:
This study had two objectives: 1) discover the types of digital health apps with regard to their quality scores across three domains (professional/clinical assurance, user experience, and data privacy) and their user ratings; 2) determine whether the NICE ESF tier, target users, categories, or features of the digital health apps have any association with this typology.
Methods:
This study was conducted using data from 1402 digital health app assessments. Each app was assessed using the ORCHA baseline review (OBR), covering the app's professional/clinical assurance (PCA), user experience, and data privacy (DP). K-medoids clustering was applied to these data to discover a typology of digital health apps, with the number of clusters determined using the elbow method. The Shapiro-Wilk test was used to check whether the user ratings and the OBR scores were normally distributed. Based on the results of the Shapiro-Wilk test, the unpaired two-samples Wilcoxon test was used to compare corresponding user ratings and OBR scores among clusters. Post hoc analysis was conducted by counting the prevalence of each target user group, category, and feature in each cluster. The Fisher exact test (P<.05, with a Bonferroni-corrected alpha value) was used to determine whether the differences in proportion among clusters were statistically significant, and effect sizes were determined using Cohen W.
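The clustering step above can be sketched in code. The following is a minimal, illustrative implementation of k-medoids (Voronoi iteration with random restarts) together with the elbow method; the study's actual data (OBR scores and user ratings) and the authors' exact algorithm settings are not given in the abstract, so synthetic stand-in data are used here. Packaged implementations (e.g., `KMedoids` in scikit-learn-extra) would typically be preferred in practice.

```python
import numpy as np

def k_medoids(X, k, n_init=5, max_iter=100, seed=0):
    """Voronoi-iteration k-medoids with random restarts (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Pairwise Euclidean distance matrix (any dissimilarity measure would work).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    best = None
    for _ in range(n_init):
        medoids = rng.choice(n, size=k, replace=False)
        for _ in range(max_iter):
            labels = np.argmin(D[:, medoids], axis=1)  # assign to nearest medoid
            new_medoids = medoids.copy()
            for j in range(k):
                members = np.flatnonzero(labels == j)
                if members.size:
                    # The new medoid minimizes total distance within its cluster.
                    within = D[np.ix_(members, members)].sum(axis=1)
                    new_medoids[j] = members[np.argmin(within)]
            if np.array_equal(new_medoids, medoids):
                break
            medoids = new_medoids
        labels = np.argmin(D[:, medoids], axis=1)
        cost = D[np.arange(n), medoids[labels]].sum()  # total within-cluster cost
        if best is None or cost < best[2]:
            best = (medoids, labels, cost)
    return best

# Elbow method on synthetic stand-in data (two tight, well-separated groups):
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.05, (20, 3)), rng.normal(10, 0.05, (20, 3))])
costs = [k_medoids(X, k)[2] for k in range(1, 6)]
# The cost curve drops sharply up to the true number of clusters, then flattens;
# the elbow method selects k at that bend.
```

K-medoids is chosen over k-means here for the same reason it is often chosen in practice: the cluster centers are actual data points (apps), and the method accepts arbitrary dissimilarity measures, making it more robust to outliers in the score distributions.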
Results:
Four clusters were discovered and labelled: 1) apps (n=220) with poor user ratings, 2) apps (n=252) with poor PCA/DP scores, 3) apps (n=415) with poor PCA scores, and 4) higher quality apps (n=515) with higher user ratings. Post hoc analysis with the Fisher exact test reached statistical significance (using a Bonferroni-corrected alpha value) for NICE ESF tiers (2/3), target users (0/14), categories (4/33), and features (6/19) when comparing the smallest and largest clusters by relative percentage of prevalence. Cohen W was <.2 for all NICE ESF tiers, target users, categories, and features.
Conclusions:
The principal findings of the analysis were: 1) the most common digital health apps are those with high user ratings and high OBR quality scores (36.7%); 2) many digital health apps (29.6%) lack professional/clinical assurance but excel in user rating, user experience, and data privacy; 3) user rating is not indicative of OBR quality assessment scoring: digital health apps can receive high user ratings and low OBR scores, and vice versa. NICE ESF tiers, target users, categories, and features, by and large, appear to have no association with the clusters.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.