Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Aug 13, 2024
Date Accepted: Feb 17, 2025
Improving Systematic Review Updates with Natural Language Processing: A Study on Screening Model Efficiency Through Component Classification and Selection
ABSTRACT
Background:
One of the challenges in updating systematic reviews is the workload associated with screening the literature. Many recent screening models use natural language processing to triage articles based on their titles and abstracts. However, these models may underperform because excessive textual information causes overfitting. Selecting specific elements from abstracts could improve model performance.
Objective:
Our study aimed to evaluate the efficacy of a novel screening model that selects specific elements from abstracts to improve screening performance. We also aimed to develop an automatic systematic review update model that uses an element classifier to categorize abstracts by their components.
Methods:
A screening model was created from the articles included in and excluded from an existing systematic review and was used as the scheme for automatically updating that review. One previously published systematic review was selected, and the articles included or excluded during its literature-screening process were used as training data. The titles and abstracts of these articles were classified into five categories (Title: T, Introduction: I, Methods: M, Results: R, Conclusion: C). Thirty-one element-composition datasets were created by combining the five element datasets. We implemented 31 screening models using these element-composition datasets and compared their performance. Comparisons were conducted using three pretrained models: BERT, BioLinkBERT, and BioM-ELECTRA. Moreover, to automate the element selection of abstracts, we developed an Abstract Element Classifier Model and used its classifications to create element datasets. From these automatically classified element datasets, we created the 10 element-composition datasets used by the 10 best-performing screening models trained on the manually classified element datasets. Ten screening models were implemented using these datasets, and their performance was compared with that of the models built on the corresponding manually classified element datasets. The primary evaluation metric was the recall-weighted F10 score.
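The count of 31 element-composition datasets follows from taking every non-empty combination of the five abstract elements (2^5 − 1 = 31). A minimal sketch of that enumeration (the element labels mirror the T/I/M/R/C categories above; the function name is illustrative, not from the study):

```python
from itertools import combinations

# The five abstract elements described in the Methods section
ELEMENTS = ["T", "I", "M", "R", "C"]

def element_compositions(elements):
    """Enumerate every non-empty subset of the element set."""
    combos = []
    for k in range(1, len(elements) + 1):
        combos.extend(combinations(elements, k))
    return combos

combos = element_compositions(ELEMENTS)
print(len(combos))  # 31 element-composition datasets (2^5 - 1)
```

Each subset defines which sections of the title/abstract are concatenated to form the text fed to a screening model.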
Results:
A total of 256 included articles and 1,261 excluded articles were extracted from the selected systematic review. Among the screening models trained on manually classified datasets, several outperformed the model trained on all elements (BERT: 9 models; BioLinkBERT: 6 models; BioM-ELECTRA: 21 models). Among the models trained on datasets classified by the Abstract Element Classifier Model, several (BERT: 7 models; BioM-ELECTRA: 9 models) also outperformed the model trained on all elements.
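The primary metric, a recall-weighted F10 score, is the standard F-beta measure with beta = 10, which weights recall far above precision; this reflects the screening priority of not missing relevant studies. A sketch using scikit-learn (the label values are toy data, not the study's results):

```python
from sklearn.metrics import fbeta_score

# Toy screening labels: 1 = include, 0 = exclude (illustrative values only)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

# beta = 10 weights recall roughly 100x more than precision in the
# harmonic mean, so a missed inclusion (false negative) is penalized
# far more heavily than a spurious inclusion (false positive)
f10 = fbeta_score(y_true, y_pred, beta=10)
print(round(f10, 3))  # 0.75 here, since precision and recall are both 0.75
```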
Conclusions:
Selecting specific elements from the titles and abstracts of articles can improve the performance of screening models. This approach should help reduce the workload involved in updating systematic reviews.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.