Accepted for/Published in: JMIR Research Protocols
Date Submitted: Mar 5, 2024
Date Accepted: Jun 4, 2024
Combining Federated Machine Learning and Qualitative Methods to Investigate Novel Pediatric Asthma Subtypes: Protocol for a Mixed Methods Study
ABSTRACT
Background:
Pediatric asthma is a heterogeneous disease; however, current characterizations of its subtypes are limited. Machine learning (ML) methods are well suited for identifying subtypes. In particular, deep neural networks can learn patient representations by leveraging longitudinal information captured in electronic health records (EHRs) while considering future outcomes. However, the traditional approach to subtype analysis requires large amounts of EHR data, which may contain protected health information (PHI) and therefore raise concerns about patient privacy. Federated learning is a key technology for addressing these privacy concerns while preserving the accuracy and performance of ML algorithms. It could enable multisite development and implementation of ML algorithms, facilitating the translation of artificial intelligence into clinical practice.
Objective:
To develop a research protocol for implementing federated ML across a large clinical research network to identify pediatric asthma subtypes and characterize their progression over time.
Methods:
We will develop a research-grade pediatric asthma computable phenotype and clinical natural language processing pipeline. We will then apply federated learning to characterize pediatric asthma subtypes and their temporal progression. Focus groups with practicing pediatric asthma clinicians will be interwoven to investigate the clinical utility of the subtypes.
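The federated learning step can be illustrated with a minimal sketch. The protocol summary above does not name an aggregation algorithm, so this example assumes federated averaging (FedAvg), uses a toy linear model in place of a deep neural network, and uses hypothetical function names (`local_update`, `federated_average`, `run_round`). It shows only the general idea: each site trains on its own records, and only model parameters, never patient-level data, are sent for aggregation.

```python
from typing import List, Sequence, Tuple

Site = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (features, targets)


def local_update(weights: Sequence[float], features: Sequence[Sequence[float]],
                 targets: Sequence[float], lr: float = 0.1,
                 epochs: int = 5) -> List[float]:
    """Train a toy linear model on one site's data (assumed stand-in model).

    Patient-level rows are read here and never leave this function;
    only the updated parameter vector is returned.
    """
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n  # gradient of mean squared error
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Average per-site parameters, weighted by each site's sample count."""
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(site_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(site_weights, site_sizes):
        if len(weights) != n_params:
            raise ValueError("all sites must share one parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


def run_round(global_weights: Sequence[float], sites: Sequence[Site]) -> List[float]:
    """One communication round: local training at each site, then aggregation."""
    updates = [local_update(global_weights, X, y) for X, y in sites]
    sizes = [len(X) for X, _ in sites]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Two synthetic "sites" whose data follow y = 2x (illustrative values only).
    sites = [([[1.0], [2.0]], [2.0, 4.0]), ([[3.0]], [6.0])]
    weights = [0.0]
    for _ in range(20):
        weights = run_round(weights, sites)
    print(weights)
```

In a real deployment the aggregation would typically run on a coordinating server, and a production framework would add secure communication and possibly further privacy protections, none of which this sketch attempts.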
Results:
OneFlorida+ data from 2011 to 2023 contained 411,628 patients aged 2–18 years and 11,156,148 clinical notes.
Conclusions:
Pediatric asthma subtypes incorporating real-world data (RWD) from diverse populations could improve patient outcomes by moving the field closer to precision pediatric asthma care. Our privacy-preserving federated learning methodology and qualitative implementation work will address several challenges of applying ML to large, multicenter RWD.
Clinical Trial:
Not applicable
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.