Currently submitted to: JMIR Research Protocols
Date Submitted: May 7, 2026
Open Peer Review Period: May 11, 2026 - Jun 18, 2026
NOTE: This is an unreviewed Preprint
A multisource, multilevel contextual database to support the Ending the HIV Epidemic initiative: Protocol for design, construction, harmonization, and quality assurance
ABSTRACT
Background:
The Ending the HIV Epidemic (EHE) initiative remains a national priority in the United States (U.S.), aiming to reduce new HIV infections by 90% by 2030. As the initiative crosses its midpoint, there has been a renewed commitment to strengthening the HIV workforce’s capacity to plan, implement, and sustain effective HIV prevention, treatment, and care interventions. Despite substantial improvements in HIV outcomes, uneven implementation of evidence-based interventions reflects persistent gaps between available evidence and its translation into locally actionable practice. Achieving EHE goals requires tailoring implementation to the diverse epidemiological, social, and structural conditions shaping HIV outcomes across jurisdictions. Research increasingly highlights the value of integrated, contextual data to strengthen public health decision-making. Linking indicators that span multiple conceptual domains across regional, local, and individual levels can support a more robust understanding of the distinct drivers of HIV outcomes, yet existing data systems remain fragmented across domains and scales. A harmonized, multisource, multilevel database is therefore essential to support targeted, needs-based, and data-driven implementation under the EHE initiative.
Objective:
This project has two objectives: (1) to build a high-quality contextual database integrating multiple sources of public data using transparent, replicable, and updateable methods, and (2) to develop and document systematic workflows for ongoing database updates and quality assurance, and to support future use aligned with open-science frameworks and standard data practices.
Methods:
This project will follow best practices in data architecture, acquisition, standardization, and quality assurance. For Objective 1, we will integrate data across multiple geographic levels (e.g., ZIP code, county) for the years 2020-2025, with measures categorized into conceptual domains (e.g., epidemiologic, sociodemographic) guided by established theoretical frameworks to facilitate future analyses. For Objective 2, we will develop a tiered data structure to enable transparent and reproducible data management, using a GitHub repository to store all documentation, processing scripts, and quality assurance logs in line with open-science practices. Database construction and quality assurance methods were informed by targeted literature reviews in PubMed. Data sources will be identified from three inputs: existing data repositories, datasets identified through targeted literature reviews, and reports or grey literature with consistent formatting and permissive terms of use suitable for web scraping. Stakeholder engagement will be integrated throughout all phases of database development, informing variable selection, usability, and validation to enable iterative refinement and revision.
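As an illustration only, the harmonization and quality assurance steps described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the names `normalize_fips`, `HarmonizedTable`, `add_source`, the source labels, the measure names, and all values are hypothetical, and the county FIPS code is assumed here as the linking key for county-level records.

```python
from dataclasses import dataclass, field

STUDY_YEARS = range(2020, 2026)  # 2020-2025 inclusive, as stated in the Methods


def normalize_fips(raw):
    """Return a 5-character, zero-padded county FIPS string, or None if invalid."""
    text = str(raw).strip()
    if not text.isdigit() or len(text) > 5:
        return None
    return text.zfill(5)


@dataclass
class HarmonizedTable:
    """Hypothetical county-year table; not the protocol's actual schema."""
    # Keyed by (county_fips, year); each value maps "domain.measure" -> value.
    rows: dict = field(default_factory=dict)
    # Each entry is (source_name, issue_code, offending_record).
    qa_log: list = field(default_factory=list)

    def add_source(self, source_name, domain, records):
        """Merge one source's records; log and skip rows that fail checks."""
        for rec in records:
            fips = normalize_fips(rec.get("fips", ""))
            year = rec.get("year")
            if fips is None:
                self.qa_log.append((source_name, "invalid_fips", rec))
                continue
            if year not in STUDY_YEARS:
                self.qa_log.append((source_name, "year_out_of_range", rec))
                continue
            row = self.rows.setdefault((fips, year), {})
            for name, value in rec.items():
                if name in ("fips", "year"):
                    continue
                column = f"{domain}.{name}"  # prefix each measure with its domain
                if column in row:
                    # Keep the first value seen and record the conflict.
                    self.qa_log.append((source_name, "duplicate_measure", rec))
                    continue
                row[column] = value


# Made-up example records; the values carry no empirical meaning.
table = HarmonizedTable()
table.add_source("source_a", "epidemiologic", [
    {"fips": 1001, "year": 2021, "new_diagnoses": 12},
    {"fips": "01003", "year": 2019, "new_diagnoses": 30},  # outside 2020-2025
])
table.add_source("source_b", "sociodemographic", [
    {"fips": "1001", "year": 2021, "pct_uninsured": 9.4},
    {"fips": "ABCDE", "year": 2022, "pct_uninsured": 11.0},  # malformed key
])
```

In this sketch, rejected rows are written to a log rather than silently dropped, so each exclusion stays traceable to its source, which is one way the quality assurance logs mentioned above could be populated.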
Results:
Literature reviews were conducted from October to November 2025 to inform database construction methods, source identification, and protocol development. Data acquisition will begin in May 2026.
Conclusions:
This contextual database will provide a reproducible and scalable data resource to support public health planning and advance implementation science by enabling more context-responsive decision-making under the EHE initiative.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.