Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Jul 26, 2024
Date Accepted: Jul 31, 2025
Correcting for the inflated adult population denominator in an English nationwide healthcare cohort: Implications for real-world evidence research
ABSTRACT
Background:
The digitally mature English National Health Service (NHS) hosts a national pandemic planning and research dataset which, combined with the comprehensive provision of COVID-19 vaccination and emergency care, makes England an ideal country in which to study the effectiveness of COVID-19 vaccines. The potential for differences in the size of the English population based on numbers registered with general practitioners compared with census data has been acknowledged previously. However, the full extent of the discrepancy is not understood.
Objective:
To report any differences between the GP-registered adult population size based on healthcare records and census estimates for England, and to apply a method to correct for such differences.
Methods:
We compared the number of adult patients within the General Practice Extraction Service Data for Pandemic Planning and Research (GDPPR) who had a valid general practitioner (GP) registration as of 1 October 2021 with estimates published by the Office for National Statistics (ONS) for the English population. We used an approach adapted from a weighting method used to correct for non-response bias in surveys, and down-weighted individuals with no evidence of recent activity in their records.
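As a rough illustration of the down-weighting idea described above, the following Python sketch gives every registered adult an initial weight of 1 and reduces the weight of those with no recorded activity since a cutoff date, so that the weighted count in a stratum moves towards the census estimate. The function name, the record layout, the single-stratum setup, and the rule that inactive individuals share the residual census count are all assumptions made for illustration; the abstract does not specify the exact weighting formula.

```python
from datetime import date

# Cutoff taken from the abstract: no engagement since January 2019.
CUTOFF = date(2019, 1, 1)


def down_weight(records, census_total, cutoff=CUTOFF):
    """Return one weight per record for a single stratum (hypothetical helper).

    records: list of dicts with a 'last_activity' key (date or None).
    census_total: census-based population estimate for the same stratum.

    Assumption: people with activity on or after the cutoff keep weight 1.0;
    people without it share the remaining census count equally, with the
    shared weight clipped to the range [0, 1].
    """
    active = [
        r["last_activity"] is not None and r["last_activity"] >= cutoff
        for r in records
    ]
    n_active = sum(active)
    n_inactive = len(records) - n_active
    if n_inactive == 0:
        return [1.0] * len(records)
    shared = (census_total - n_active) / n_inactive
    shared = min(1.0, max(0.0, shared))
    return [1.0 if is_active else shared for is_active in active]


# Toy stratum: five registered people, census estimate of four.
toy_records = [
    {"last_activity": date(2021, 3, 1)},
    {"last_activity": date(2020, 7, 15)},
    {"last_activity": date(2019, 1, 1)},
    {"last_activity": date(2017, 5, 9)},
    {"last_activity": None},
]
toy_weights = down_weight(toy_records, census_total=4)
print(toy_weights, sum(toy_weights))
```

In this toy stratum the three active individuals keep their full weight and the two inactive individuals are each weighted 0.5, so the weighted count equals the census estimate. A real analysis would presumably apply such a rule within sociodemographic strata rather than to the population as a whole.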
Results:
There were 61,194,033 registered NHS patients in the GDPPR compared with 56,550,138 in the ONS census-based population. De-duplication on NHS number reduced the GDPPR population to 57,876,641, including 46,835,968 adults, with the greatest overrepresentation among those aged 30–45 years. Of the 46,835,986 adults, 1,121,954 (2.4%) had their initial weights reduced because of non-engagement with the healthcare system since January 2019. The down-weighting removed most of the differences between the NHS and ONS populations.
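The size of the discrepancy implied by these figures can be checked with a few lines of arithmetic. The Python sketch below uses only the counts quoted in the Results; the variable names and the `pct` helper are illustrative. The abstract gives the adult count as both 46,835,968 and 46,835,986; the first value is assumed here, and the down-weighted share rounds to 2.4% with either.

```python
# Counts quoted in the Results paragraph of the abstract.
gdppr_registered = 61_194_033    # registered NHS patients in the GDPPR
ons_population = 56_550_138      # ONS census-based population
gdppr_deduplicated = 57_876_641  # after de-duplication on NHS number
adults = 46_835_968              # adults after de-duplication (assumed value)
down_weighted = 1_121_954        # adults whose initial weight was reduced


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


raw_excess = gdppr_registered - ons_population
duplicates_removed = gdppr_registered - gdppr_deduplicated
residual_excess = gdppr_deduplicated - ons_population

print("Excess before de-duplication:", raw_excess, pct(raw_excess, ons_population))
print("Records removed by de-duplication:", duplicates_removed)
print("Excess after de-duplication:", residual_excess, pct(residual_excess, ons_population))
print("Share of adults down-weighted (%):", pct(down_weighted, adults))
```

On these figures the registered population exceeds the ONS estimate by about 8.2% before de-duplication and by about 2.3% afterwards, which is of a similar order to the 2.4% of adults who were down-weighted.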
Conclusions:
There are notable differences between the adult population size in the GDPPR and census estimates. While the overall population size in the GDPPR data was inflated relative to ONS census estimates, the degree of inflation differed across sociodemographic groups. A weighting-based approach can be applied to correct for the inflated denominator. Not correcting for it in the English NHS data could introduce selection bias.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.