Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 17, 2023
Date Accepted: Jul 23, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Accelerating Evidence Synthesis in Observational Studies: A Living NLP-Assisted Intelligent Systematic Literature Review System
ABSTRACT
Background:
Systematic literature review (SLR), a robust method to identify and summarize evidence from published sources, is considered a complex, time-consuming, labor-intensive, and expensive task.
Objective:
To present a solution based on natural language processing (NLP) that accelerates and streamlines the SLR process for observational studies using real-world data.
Methods:
We followed an agile, iterative software engineering methodology to build a customized, intelligent, end-to-end, living NLP-assisted solution for observational SLR tasks. Multiple machine learning-based NLP algorithms were adopted to automate the article screening and data element extraction processes. Following a human-in-the-loop design, the NLP predictions can be further reviewed and verified by domain experts. The system integrates explainable artificial intelligence (XAI) to provide evidence for the NLP algorithms' predictions and to add transparency to the extracted literature data elements. The system was developed based on three existing SLR projects of observational studies: epidemiology studies of human papillomavirus-associated diseases, the disease burden of pneumococcal diseases, and cost-effectiveness studies of pneumococcal vaccines.
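The human-in-the-loop step described above can be illustrated with a minimal sketch. It is not the platform's implementation: the `ScreeningRecord` class, the `review_queue` helper, the confidence score, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The only idea taken from the abstract is that an expert's decision, when present, takes precedence over the model's prediction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScreeningRecord:
    """One article with a model prediction and an optional expert decision (hypothetical schema)."""
    article_id: str
    model_label: str                      # "include" or "exclude"
    model_confidence: float               # assumed score in [0, 1]
    reviewer_label: Optional[str] = None  # set once a domain expert verifies the record

    @property
    def final_label(self) -> str:
        # The expert's decision overrides the model's prediction when available.
        return self.reviewer_label if self.reviewer_label is not None else self.model_label


def review_queue(records: List[ScreeningRecord], threshold: float = 0.8) -> List[ScreeningRecord]:
    """Return unreviewed records below the confidence threshold, least confident first."""
    pending = [
        r for r in records
        if r.reviewer_label is None and r.model_confidence < threshold
    ]
    return sorted(pending, key=lambda r: r.model_confidence)


# Example usage with made-up records.
records = [
    ScreeningRecord("A", "include", 0.95),
    ScreeningRecord("B", "exclude", 0.55),
    ScreeningRecord("C", "include", 0.70),
]
queue_before = [r.article_id for r in review_queue(records)]

# A reviewer disagrees with the model on article B.
records[1].reviewer_label = "include"
queue_after = [r.article_id for r in review_queue(records)]
```

Prioritizing low-confidence predictions is one common way to spend reviewer time; the abstract does not state how the platform orders records for review.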
Results:
Our Intelligent SLR Platform covers the major SLR steps, including study protocol setting, literature retrieval, abstract screening, full-text screening, data element extraction from full-text articles, results summary, and data visualization. The NLP algorithms achieved accuracy scores of 0.86 to 0.90 on article screening tasks (framed as text classification tasks) and macro-average F1 scores of 0.57 to 0.89 on data element extraction tasks (framed as named entity recognition tasks).
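For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and macro-average F1 are defined. It is a simplified illustration under stated assumptions, not the authors' evaluation code: labels are compared item by item, whereas named entity recognition is often scored at the entity-span level, and the function names and toy labels are invented for the example.

```python
from collections import Counter
from typing import List


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of items whose predicted label matches the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of per-class F1 scores, so rare classes count as much as common ones."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy screening decisions (invented for illustration).
gold = ["include", "exclude", "include", "exclude", "include"]
pred = ["include", "exclude", "exclude", "exclude", "include"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```

Because the macro average weights every class equally, a single poorly extracted data element type can pull the score down, which may help explain why the reported F1 range is wider than the accuracy range.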
Conclusions:
Cutting-edge NLP algorithms expedite SLR for observational studies, giving scientists more time to focus on the quality of data and the synthesis of evidence. In line with the living systematic literature review concept, the system has the potential to update literature data prospectively and continuously, enabling scientists to easily stay current with the literature on observational studies.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.