Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 28, 2024
Date Accepted: Oct 27, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Applying Artificial Intelligence in Structured Real-World Data for Pharmacovigilance Purposes: A Systematic Literature Review
ABSTRACT
Background:
Applying artificial intelligence (AI) to real-world data (RWD), e.g., electronic healthcare records (EHRs), has been identified as a promising technical paradigm for the pharmacovigilance (PV) field. AI approaches have been applied to RWD in several ways; however, most studies focus on unstructured RWD, i.e., conducting natural language processing (NLP) on various data sources (e.g., clinical notes, social media, and blogs). Hence, it is essential to investigate how AI is already applied to structured RWD in PV and how new approaches could enrich the existing methodology.
Objective:
This manuscript provides a systematic literature review (SLR) of the emerging use of AI on structured RWD for PV purposes, aiming to identify relevant trends and potential research gaps.
Methods:
The SLR methodology is based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. Relevant scientific manuscripts were retrieved from PubMed on January 31, 2024. The included studies were mapped against a set of evaluation criteria: the applied AI approaches, code availability, description of the data preprocessing pipeline, implementation of trustworthy AI criteria, and clinical validation of the AI models.
Results:
The SLR yielded 36 studies. The number of studies increased markedly after 2019. Most articles focus on adverse drug reaction (ADR) detection (64%) for specific adverse effects. Furthermore, more than 90% of the studies used non-symbolic AI approaches (machine learning [ML] and deep learning [DL]), emphasizing classification tasks. Random forest was the most popular ML approach in this review (47%). The most common RWD sources were EHRs (78%). Typically, these data are not represented in a widely acknowledged data model that would facilitate interoperability, and they come from proprietary databases; thus, they are not available for reproducing results. Regarding the evaluation criteria, 10% of the studies published their code in public registries, 16% tested their AI models in clinical environments, and 36% provided information about the data preprocessing pipeline. Additionally, in terms of trustworthy AI, 89% of the articles followed at least half of the FUTURE AI initiative guidelines.
Conclusions:
AI, along with structured RWD, constitutes a new and promising line of work for drug safety and PV. However, some AI approaches, such as explainable AI and causal AI, have not been examined extensively in this field. Moreover, a data preprocessing protocol for RWD would help support PV processes. Finally, because of the sensitivity of personal data, evaluation procedures have to be investigated further.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.