Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jun 18, 2020
Date Accepted: Mar 7, 2021
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Distributed Regression Analysis with Vertically Partitioned Data facilitated by PopMedNet: A Feasibility and Enhancement Study
ABSTRACT
Background:
In clinical research, important variables may be collected in multiple data sources. Physically pooling patient-level data across sources raises several challenges, including protecting patient privacy and proprietary interests. We previously developed a SAS-based package to perform distributed regression analysis (DRA), a privacy-protecting method that performs multivariable-adjusted regression analysis using only summary-level information, for horizontally partitioned data, a setting in which distinct cohorts of patients are available in different data sources. We integrated the package with PopMedNet, open-source file transfer software, to facilitate information exchange between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate DRA with vertically partitioned data, a setting in which the data attributes of a cohort of patients are available in different data sources, was unknown.
Objective:
To describe the feasibility of using PopMedNet to facilitate vertical DRA (vDRA) in the real-world setting, and the enhancements made to PopMedNet for that purpose.
Methods:
We gathered the statistical and informatics requirements for using PopMedNet to facilitate automatable vDRA. We then implemented enhancements to PopMedNet to improve its technical capability to facilitate vDRA.
Results:
PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in the real-world setting. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data sources without a semi-trusted third party.
Conclusions:
PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research discoveries in the real-world setting.
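Illustrative note: the idea of regression "with only summary-level information" in the horizontally partitioned setting can be sketched with a toy example. The Python code below is not the authors' SAS package and does not use PopMedNet; it assumes the simplest possible case (ordinary linear regression with one covariate), and every function name and data value in it is hypothetical. Each site computes a few sums locally, and the analysis center adds them up and solves for the coefficients without ever seeing patient-level rows.

```python
# Illustrative sketch only: not the authors' SAS package or PopMedNet code.
# All names (site_summary, pooled_fit) and all data values are hypothetical.

def site_summary(xs, ys):
    """Computed locally at a data-contributing site.

    Only these five sums leave the site, never the patient-level rows.
    """
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }


def pooled_fit(summaries):
    """Run at the analysis center.

    Adds the site-level sums, then solves the least-squares equations
    for the model y = intercept + slope * x.
    """
    keys = ("n", "sx", "sy", "sxx", "sxy")
    total = {k: sum(s[k] for s in summaries) for k in keys}
    n, sx, sy, sxx, sxy = (total[k] for k in keys)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Two sites holding distinct patients (horizontal partitioning).
site_a = site_summary([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
site_b = site_summary([4.0, 5.0], [9.0, 11.0])

intercept, slope = pooled_fit([site_a, site_b])
print(intercept, slope)
```

In this toy case the combined sums give the same coefficients as fitting the regression on the physically pooled rows. With vertically partitioned data the same shortcut is not directly available, because cross-products between variables held at different sites cannot be computed by any one site alone.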
Citation