Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 10, 2022
Date Accepted: Jun 27, 2022
A syntactic information-based classification model for medical literature: algorithm development and validation study
ABSTRACT
Background:
The ever-increasing volume of medical literature makes automated classification necessary. Medical relation extraction is a typical method of classifying large volumes of medical literature. With the growth of computing power, medical relation extraction models have evolved from rule-based models to neural network models. However, in discarding the traditional rules, a single neural network model also discards shallow syntactic information. Therefore, we propose a syntactic information-based classification model that complements and equalizes syntactic information to enhance the model.
Objective:
We aimed to develop a syntactic information-based relation extraction model for more efficient medical literature classification.
Methods:
We devised two methods for enhancing syntactic information in the model. First, we introduced shallow syntactic information into the convolutional neural network to enhance non-local syntactic interactions. Second, we devised a cross-domain pruning method to equalize local and non-local syntactic interactions.
Results:
We experimented with three datasets related to the classification of medical literature. The F1 values were 65.5% and 91.5% on the CPR and PGR datasets, respectively, and the accuracy was 88.7% on the PubMed dataset. In these experiments, our model outperformed the current state-of-the-art baseline model.
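For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1 and accuracy are commonly computed for relation classification. It is an illustration, not the authors' evaluation code: the abstract does not state the averaging scheme or how a "no relation" class is handled, so the micro-averaging, the `negative_label` parameter, and the function names `micro_f1` and `accuracy` are all assumptions.

```python
def micro_f1(gold, pred, negative_label=None):
    """Micro-averaged F1 over all labels except negative_label (assumed convention)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label:
            if p == g:
                tp += 1  # predicted a relation and it matches the gold label
            else:
                fp += 1  # predicted a relation that is not the gold label
        if g != negative_label and p != g:
            fn += 1      # a gold relation that was missed or mislabeled
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Toy labels, invented purely for illustration.
gold = ["A", "B", "none", "A"]
pred = ["A", "none", "A", "A"]
f1_value = micro_f1(gold, pred, negative_label="none")
acc_value = accuracy(gold, pred)
```

F1 balances precision and recall, which is why it is often preferred when a "no relation" class dominates, whereas accuracy counts every class equally.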
Conclusions:
Our syntactic information-based model effectively enhances medical relation extraction. Furthermore, the experimental results show that shallow syntactic information helps capture non-local interactions in sentences and effectively reinforces syntactic features. These findings also suggest new directions for future research.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.