Accepted for/Published in: JMIR Formative Research
Date Submitted: Apr 22, 2022
Date Accepted: Jan 24, 2023
Online public ratings of general practitioners in Norway: validation study
ABSTRACT
Background:
The relationships between the multiple strategies currently used to evaluate service performance from the users' perspective are complex, yet understanding them is crucial for interpreting the resulting measurements.
Objective:
The main objectives were to: a) evaluate the psychometric performance of an 11-item online questionnaire for rating general practitioners (GPs) currently used in Norway (“Legelisten.no”), and b) assess the association between online and survey-based patient experience indicators.
Methods:
We included all ratings of GPs and practices published on Legelisten.no between May 26, 2012, and December 15, 2021 (n=76,521). The questionnaire consists of one mandatory and 10 voluntary items with five response categories (one to five stars), alongside an open-ended review question and background variables. Questionnaire dimensionality and internal consistency were assessed with Cronbach’s α, exploratory factor analysis, and item response theory (IRT) analysis, and a priori hypotheses were developed for assessing construct validity (2). We calculated Spearman’s correlations between online ratings and reference patient experience indicators based on survey data collected with the Patient Experiences with GP Questionnaire (PEQ-GP) (n=5,623 respondents for a random sample of 50 GPs).
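As an illustration of two of the statistics named above, the following is a minimal sketch of Cronbach’s α and Spearman’s ρ in plain Python. It is not the authors’ analysis code: the function names (`cronbach_alpha`, `average_ranks`, `spearman_rho`) and the toy star ratings are hypothetical, and the sketch assumes complete cases only (no missing item ratings) and non-constant inputs.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete cases (no missing items) and k >= 2.
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy data: five raters, three items on a 1-5 star scale.
toy_ratings = [[5, 5, 4], [1, 2, 1], [4, 4, 5], [2, 1, 2], [5, 4, 5]]
alpha = cronbach_alpha(toy_ratings)

# Hypothetical per-GP mean scores from two sources (online vs. survey).
online = [4.8, 3.1, 4.2, 2.5]
survey = [4.5, 3.9, 4.1, 3.0]
rho = spearman_rho(online, survey)
```

In this sketch the rank-based correlation is computed per GP, which mirrors the idea of comparing aggregated online ratings with aggregated survey indicators; how the study itself aggregated scores is not specified here.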
Results:
Online raters were predominantly women (64.0%), aged 20-50 years (74.6%), and reported 5 or fewer consultations with the GP each year (64.5%). Ratings were missing for 18.9% to 27.4% of non-mandatory items. Four of 11 rating items showed a U-shaped distribution, with >60% reporting five stars. Factor analysis and internal consistency testing identified two rating scales: “GP” (5 items; α=0.98) and “Practice” (6 items; α=0.85). Some associations were not consistent with the a priori hypotheses, allowing only partial confirmation of the construct validity of the ratings. IRT analysis results were adequate for the “Practice” scale but not for the “GP” scale, whose items showed inflated discrimination (>5) and were distributed over a narrow interval of the scale. Correlations between the online ratings GP scale and GP reference indicators ranged from 0.34 to 0.44 (p<0.05), while correlations between the online ratings Practice scale and reference indicators ranged from 0.17 (ns) to 0.49 (p<0.05). The strongest correlations between online and survey scores were found for items measuring practice-related experiences: phone availability (ρ=0.51), waiting time in office (ρ=0.62), and other staff (ρ=0.54, 0.58) (p<0.05).
Conclusions:
The Practice scale of the online ratings has adequate psychometric performance, while the GP scale suffers from important limitations. The available online ratings were derived from non-representative samples of the population with skewed and polarized views of GPs. The associations with survey-based patient experience indicators were accordingly mostly weak to modest. Our study underlines the importance of interpreting online ratings with caution and of further developing rating sites.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.