Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Jul 26, 2024
Date Accepted: Jul 31, 2025

The final, peer-reviewed published version of this preprint can be found here:

Correcting for the Inflated Adult Population Denominator in an English Nationwide Health Care Cohort: Database Analysis Study

Venkatesan S, Joy M, Jamie G, Kar D, Williams R, Fan X, Meeraus W, Tsang RS, Taylor K, Taylor S, Hobbs FR, Anand SN, Byford R, Robertson C, de Lusignan S

JMIR Public Health Surveill 2025;11:e64788

DOI: 10.2196/64788

PMID: 41144579

PMCID: 12559012

Correcting for the inflated adult population denominator in an English nationwide healthcare cohort: Implications for real-world evidence research

  • Sudhir Venkatesan; 
  • Mark Joy; 
  • Gavin Jamie; 
  • Debasish Kar; 
  • Robert Williams; 
  • Xuejuan Fan; 
  • Wilhelmine Meeraus; 
  • Ruby SM Tsang; 
  • Kathryn Taylor; 
  • Sylvia Taylor; 
  • FD Richard Hobbs; 
  • Sneha N Anand; 
  • Rachel Byford; 
  • Chris Robertson; 
  • Simon de Lusignan

ABSTRACT

Background:

The digitally mature English National Health Service (NHS) hosts a national pandemic planning and research dataset which, combined with the comprehensive provision of COVID-19 vaccination and emergency care, makes England an ideal country in which to study the effectiveness of COVID-19 vaccines. The potential for differences in the size of the English population based on numbers registered with general practitioners compared with census data has been acknowledged previously. However, the full extent of the discrepancy is not understood.

Objective:

To report any differences between the size of the GP-registered adult population derived from healthcare records and census estimates for England, and to apply a method to correct for such differences.

Methods:

We compared the number of adult patients within the General Practice Extraction Service Data for Pandemic Planning and Research (GDPPR) with a valid general practitioner (GP) registration as of 1st October 2021 with estimates published by the Office for National Statistics (ONS) for the English population. We adapted a weighting method used to correct for non-response bias in surveys, down-weighting individuals whose records showed no evidence of recent activity.
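The down-weighting idea described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the `Patient` record, the `INACTIVE_FACTOR` value, and the exact cutoff day are hypothetical and chosen only for illustration, since the abstract does not report the weights actually used. The sketch assumes every registered adult starts with a weight of 1 and that the weight is reduced when the record shows no activity on or after the cutoff.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional

# Cutoff month follows the Results ("since January 2019"); the exact day is an assumption.
ACTIVITY_CUTOFF = date(2019, 1, 1)

# Hypothetical down-weighting factor; the abstract does not state the value used.
INACTIVE_FACTOR = 0.5


@dataclass
class Patient:
    patient_id: str                 # hypothetical identifier
    last_activity: Optional[date]   # None means no recorded activity at all
    initial_weight: float = 1.0


def adjusted_weight(p: Patient) -> float:
    """Return the initial weight, reduced when there is no recent activity."""
    inactive = p.last_activity is None or p.last_activity < ACTIVITY_CUTOFF
    return p.initial_weight * INACTIVE_FACTOR if inactive else p.initial_weight


def weighted_population(patients: Iterable[Patient]) -> float:
    """Sum of adjusted weights, i.e. the corrected denominator."""
    return sum(adjusted_weight(p) for p in patients)


# Toy cohort: one active patient and two with no recent activity.
cohort = [
    Patient("a", date(2021, 6, 15)),
    Patient("b", date(2017, 3, 2)),
    Patient("c", None),
]
corrected_denominator = weighted_population(cohort)
```

In this toy cohort the raw count is 3, while the weighted denominator is smaller because two records carry reduced weight. In practice the factor would be derived from the data rather than fixed by hand.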

Results:

There were 61,194,033 registered NHS patients in the GDPPR compared with 56,550,138 in the ONS census-based population. De-duplication on NHS number reduced the GDPPR population to 57,876,641, including 46,835,968 adults, with the most overrepresented group being those aged 30–45 years. Of the 46,835,986, 1,121,954 (2.4%) individuals had their initial weights reduced due to non-engagement with the healthcare system since January 2019. The down-weighting removed most of the differences between the NHS and ONS populations.
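The size of the discrepancy can be worked through directly from the counts reported above. The short sketch below only reproduces arithmetic on those published figures; the variable names are illustrative, and the adult count uses the first of the two values given in the Results.

```python
# Counts as reported in the Results.
gdppr_registered = 61_194_033    # registered NHS patients in the GDPPR
ons_population = 56_550_138      # ONS census-based population
gdppr_deduplicated = 57_876_641  # after de-duplication on NHS number
gdppr_adults = 46_835_968        # adults after de-duplication (first value reported)
down_weighted = 1_121_954        # adults whose initial weight was reduced

# Excess of GDPPR registrations over the ONS estimate, before de-duplication.
raw_excess = gdppr_registered - ons_population
raw_excess_pct = round(100 * raw_excess / ons_population, 1)

# Records removed by de-duplication, and the excess that remains afterwards.
duplicates_removed = gdppr_registered - gdppr_deduplicated
remaining_excess = gdppr_deduplicated - ons_population
remaining_excess_pct = round(100 * remaining_excess / ons_population, 1)

# Share of adults that were down-weighted.
down_weighted_pct = round(100 * down_weighted / gdppr_adults, 1)

print(f"Raw excess: {raw_excess:,} ({raw_excess_pct}%)")
print(f"Duplicates removed: {duplicates_removed:,}")
print(f"Excess after de-duplication: {remaining_excess:,} ({remaining_excess_pct}%)")
print(f"Adults down-weighted: {down_weighted_pct}%")
```

On these figures, de-duplication accounts for most of the raw excess, and the down-weighted share matches the 2.4% reported in the Results.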

Conclusions:

There are notable differences between the adult population size recorded in the GDPPR and census estimates. While the overall population size in the GDPPR data was inflated relative to ONS census estimates, the degree of inflation varied across sociodemographic variables. A weighting-based approach can be applied to correct for the inflated denominator. Not correcting for it in the English NHS data could introduce selection bias.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.