Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 10, 2020
Date Accepted: Oct 29, 2020
Date Submitted to PubMed: Oct 31, 2020

The final, peer-reviewed published version of this preprint can be found here:

Automatic Structuring of Ontology Terms Based on Lexical Granularity and Machine Learning: Algorithm Development and Validation

Luo L, Feng J, Yu H, Wang J

JMIR Med Inform 2020;8(11):e22333

DOI: 10.2196/22333

PMID: 33127601

PMCID: 7725650

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Structuring ontology terms automatically based on lexical granularity and machine learning

  • Lingyun Luo
  • Jingtao Feng
  • Huijun Yu
  • Jiaolong Wang

ABSTRACT

Background:

As the manual creation and maintenance of biomedical ontologies are labor-intensive, automatic aids are desirable in the life cycle of ontology development.

Objective:

In this study, given a set of concept names from the Foundational Model of Anatomy (FMA), we propose a method for automatically generating the taxonomy and partonomy structures among them.

Methods:

Our approach comprises two main tasks. The first is to predict the direct relation between two given concept names by using word embeddings and training two machine learning models: a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM). The second is to identify the semantic structure among a group of given concept names with an original granularity-based method that leverages the models trained in the first task.
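To illustrate how a pair-level relation predictor can be applied to a whole group of names, the sketch below runs a predictor over every ordered pair and collects the predicted child-to-parent edges. This is only a naive pairwise reading, not the granularity-based method of the study, whose details are not given in this abstract. The names `toy_is_direct_child` and `structure_pairwise` are hypothetical, and the rule-based predictor is an assumption standing in for a trained CNN or Bi-LSTM.

```python
from itertools import permutations
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (child, parent)


def toy_is_direct_child(child: str, parent: str) -> bool:
    """Hypothetical stand-in for a trained pair classifier.

    Assumption for illustration only: a name is a direct child of another
    when it equals the parent name plus exactly one leading modifier word.
    """
    c, p = child.lower().split(), parent.lower().split()
    return len(c) == len(p) + 1 and c[1:] == p


def structure_pairwise(
    names: Iterable[str], predict: Callable[[str, str], bool]
) -> Set[Edge]:
    """Query the predictor on every ordered pair and keep the positive edges."""
    return {(a, b) for a, b in permutations(list(names), 2) if predict(a, b)}


names = ["Artery", "Coronary artery", "Left coronary artery", "Vein"]
edges = structure_pairwise(names, toy_is_direct_child)
for child, parent in sorted(edges):
    print(f"{child} -> {parent}")
```

With n names this makes n(n-1) predictor calls, and any single wrong prediction becomes a wrong edge, which is one plausible reason to look for a structuring method that does more than query pairs independently.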

Results:

Both CNN and Bi-LSTM perform well on the first task, with F1 measures above 0.91. On the second task, our approach with Bi-LSTM achieves an average F1 measure of 0.79 over 100 case studies in FMA, outperforming the primitive pairwise-based method.
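For reference, the F1 measure is the harmonic mean of precision and recall. The sketch below computes it for one case by comparing a predicted structure with a gold structure as sets of (child, parent) edges. Scoring at the edge level is an assumption made here for illustration, since the abstract does not state the unit of evaluation; `edge_f1` and the example edges are hypothetical.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # (child, parent)


def edge_f1(predicted: Iterable[Edge], gold: Iterable[Edge]) -> float:
    """F1 = 2PR / (P + R) over edge sets; 0.0 when no edge is correct."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two of three predicted edges match the gold structure.
gold = [("B", "A"), ("C", "A"), ("D", "B")]
predicted = [("B", "A"), ("C", "A"), ("D", "C")]
score = edge_f1(predicted, gold)
print(round(score, 3))
```

An average over several case studies would then be the arithmetic mean of the per-case scores.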

Conclusions:

We investigate an automatic way to predict the hierarchical relation between two concept names and, building on it, develop a method for automatically structuring a group of concept names. This study is an initial investigation that will shed light on further work on automatic aids in the ontology life cycle.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.