Accepted for/Published in: JMIR AI
Date Submitted: Feb 23, 2023
Open Peer Review Period: Feb 23, 2023 - Apr 20, 2023
Date Accepted: Oct 9, 2023
Machine Learning-based Asthma Attack Prediction Models from Routinely Collected Electronic Health Records: A Systematic Scoping Review
ABSTRACT
Background:
An early warning tool to predict attacks could enhance asthma management and reduce the likelihood of serious consequences. Electronic health records (EHRs), which provide access to historical data on patients with asthma, coupled with machine learning (ML) offer an opportunity to develop such a tool. Several studies have already developed ML-based tools to predict asthma attacks.
Objective:
To critically evaluate ML-based models derived using EHRs for the prediction of asthma attacks.
Methods:
We systematically searched PubMed and Scopus for articles published between January 1, 2012, and January 31, 2023, that met the following inclusion criteria: 1) used EHR data as the main data source; 2) used asthma attack as the outcome; and 3) compared the performance of ML-based prediction models. We excluded non-English articles and non-research articles, such as commentaries and systematic reviews. We also excluded papers that did not report details of the ML approach and its results, including protocol papers. The selected studies were then summarised across multiple dimensions, including data pre-processing methods, ML algorithms, model validation, model explainability, and model implementation.
Results:
Seventeen articles were included at the end of the selection process. There was considerable heterogeneity in how asthma attacks were defined. Eight studies used routinely collected data from both primary and secondary care. Extremely imbalanced data was a notable issue in most studies (76%), but only five explicitly dealt with it in their data pre-processing pipelines. Gradient boosting-based methods performed best in 10 of the 17 studies. Fourteen studies employed a model explanation method to identify the most important predictors. None of the studies followed standard reporting guidelines, and none were prospectively validated.
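The two technical themes above, handling an extremely imbalanced outcome during pre-processing and fitting a gradient boosting classifier, can be sketched as follows. This is an illustrative example on synthetic data only, not the pipeline of any reviewed study; the rare-event rate, features, and class-weighting scheme are all assumptions.

```python
# Illustrative sketch (synthetic data, not any reviewed study's pipeline):
# a rare binary outcome (~4% positives, standing in for "asthma attack")
# predicted with gradient boosting, with the minority class up-weighted
# so the training loss is not dominated by the negative class.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 10))  # hypothetical EHR-derived predictors
# Rare positive outcome driven by two of the features
p = 1.0 / (1.0 + np.exp(-(X[:, 0] + 2.0 * X[:, 1] - 4.0)))
y = rng.binomial(1, p)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, random_state=0
)

# Pre-processing step for class imbalance: up-weight positives by the
# negative-to-positive ratio in the training split.
ratio = (y_tr == 0).sum() / max((y_tr == 1).sum(), 1)
weights = np.where(y_tr == 1, ratio, 1.0)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_tr, y_tr, sample_weight=weights)

auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"held-out AUC: {auc:.2f}")
```

Discrimination here is reported as AUC on a held-out split; a prospective or external validation, which the review found lacking, would require evaluating a frozen model on data from a different time period or care setting.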
Conclusions:
Our review indicates that this research field is still underdeveloped, given the limited body of evidence, the heterogeneity of methods, the lack of external validation, and sub-optimally reported models. We highlighted several technical challenges (class imbalance, external validation, model explanation, and adherence to reporting guidelines to aid reproducibility) that need to be addressed to make progress toward clinical adoption.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.