Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 10, 2020
Date Accepted: Oct 29, 2020
Date Submitted to PubMed: Oct 31, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Structuring ontology terms automatically based on lexical granularity and machine learning
ABSTRACT
Background:
As the manual creation and maintenance of biomedical ontologies are labor-intensive, automatic aids are desirable in the life cycle of ontology development.
Objective:
In this study, given a set of concept names from the Foundational Model of Anatomy (FMA), we propose an innovative method for automatically generating both the taxonomy and the partonomy structures among them.
Methods:
Our approach comprises two main tasks. The first task is to predict the direct relation between two given concept names by using word embeddings and training two machine learning models: convolutional neural networks (CNN) and bidirectional long short-term memory networks (Bi-LSTM). The second task is to identify the semantic structures among a group of given concept names with an original granularity-based method that leverages the models trained in the first task.
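To make the step from a pairwise relation predictor to a structure over a group of names concrete, a minimal sketch follows. It is not the granularity-based method of this study; it illustrates only the naive idea of querying a predictor on every ordered pair and keeping the direct edges. The function `predict_is_a` is a hypothetical stand-in for a trained CNN or Bi-LSTM classifier, replaced here by a toy lexical rule, and the concept names are illustrative.

```python
from itertools import permutations


def predict_is_a(child: str, parent: str) -> bool:
    """Hypothetical stand-in for a trained relation classifier.

    Toy rule (an assumption, not the paper's model): `child` is-a `parent`
    when the child's word sequence is longer and ends with the parent's words.
    """
    c, p = child.lower().split(), parent.lower().split()
    return len(c) > len(p) and c[-len(p):] == p


def pairwise_structure(names, predict=predict_is_a):
    """Build a set of direct (child, parent) edges from pairwise predictions."""
    # Query the predictor on every ordered pair of distinct names.
    candidate = {(c, p) for c, p in permutations(names, 2) if predict(c, p)}
    # Keep only direct edges: drop (c, p) when an intermediate name m
    # already links c to m and m to p.
    return {
        (c, p)
        for (c, p) in candidate
        if not any((c, m) in candidate and (m, p) in candidate for m in names)
    }


names = ["vein", "pulmonary vein", "left pulmonary vein", "right pulmonary vein"]
edges = pairwise_structure(names)
for child, parent in sorted(edges):
    print(f"{child} -> {parent}")
```

With a real classifier in place of the toy rule, the number of predictor calls grows quadratically with the number of names, and a single wrong pairwise prediction can distort the resulting hierarchy.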
Results:
Both CNN and Bi-LSTM perform well on the first task, with F1 measures above 0.91. For the second task, our approach with Bi-LSTM achieves an average F1 measure of 0.79 on 100 case studies in FMA, outperforming the primitive pairwise-based method.
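The abstract does not state how the F1 measure is computed for each case study. One common reading, assumed here purely for illustration, scores a predicted structure by comparing its set of (child, parent) edges against the reference edges and then averages the scores over all case studies. The function names `edge_f1` and `mean_f1` are hypothetical.

```python
def edge_f1(predicted: set, gold: set) -> float:
    """F1 of predicted (child, parent) edges against reference edges."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty predicted or gold sets
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f1(cases) -> float:
    """Average edge-level F1 over (predicted, gold) pairs, one per case study."""
    scores = [edge_f1(predicted, gold) for predicted, gold in cases]
    return sum(scores) / len(scores)


# Toy case: one of two predicted edges is correct, one of two gold edges is found.
predicted = {("a", "b"), ("c", "b")}
gold = {("a", "b"), ("d", "b")}
score = edge_f1(predicted, gold)
```

In this toy case precision and recall are both 0.5, so the F1 measure is also 0.5.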
Conclusions:
We investigate an automatic way to predict the hierarchical relation between two concept names and, building on it, devise a methodology for automatically structuring a group of concept names. This study is an initial investigation that will shed light on further work on automatic aids in the ontology life cycle.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.