JMIR Preprints #28212: Matching Biomedical Ontologies: Clues, Approach, and Scalability

Matching Biomedical Ontologies: Clues, Approach, and Scalability

Peng Wang;
Yunyan Hu;
Shaochen Bai;
Shiyi Zou

ABSTRACT

Background:

Ontology matching seeks to find semantic correspondences between ontologies. As more and more biomedical ontologies are developed independently and overlap in content, matching them has become a critical task in many biomedical applications. However, several challenges remain. First, constructing matching clues from the information in biomedical ontologies is a nontrivial problem. Second, it is unknown whether there are dominant matchers when matching biomedical ontologies. Finally, matching suffers from high computational complexity owing to the large scale of biomedical ontologies.

Objective:

Interoperability between biomedical ontologies is critically important; however, owing to the natural heterogeneity and large scale of biomedical ontologies, it remains very difficult to find alignments between them efficiently. This paper aims to explore matching clues and to study empirically how various strategies for combining clues influence biomedical ontology alignments. In addition, extended reduction anchors are introduced to decrease the time complexity of matching large biomedical ontologies.

Methods:

We first construct atomic and composite matching clues along four dimensions: terminology, structure, external knowledge, and representation learning. We then present a spectrum of matchers based on these clues and comprehensively investigate their effectiveness. We also carry out a systematic comparative evaluation of different combinations of matchers. Finally, we propose extended reduction anchors to reduce the time complexity of matching large-scale biomedical ontologies.
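To make the idea of combining matchers concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes two simple terminology-level matchers (token overlap and character similarity on labels) and a weighted average as the combination strategy. The function names, the weights, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (hypothetical atomic clue)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combine(scores, weights):
    """Weighted average of matcher scores; the weights are assumed, not learned."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total


def match(source_labels, target_labels, weights=(0.5, 0.5), threshold=0.8):
    """Keep, for each source label, its best-scoring target above the threshold."""
    alignment = []
    for s in source_labels:
        best_target, best_score = None, 0.0
        for t in target_labels:
            score = combine([token_jaccard(s, t), char_similarity(s, t)], weights)
            if score > best_score:
                best_target, best_score = t, score
        if best_target is not None and best_score >= threshold:
            alignment.append((s, best_target, best_score))
    return alignment
```

This sketch compares every source label with every target label, so its cost grows with the product of the two ontology sizes. That quadratic cost is the kind of scalability problem the extended reduction anchors are meant to address; the anchors themselves are not modeled here.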

Results:

The experimental results show that using distinguishable matching clues in biomedical ontologies leads to a substantial improvement in F-measure over using all available information. Incorporating different types of matchers with reliability also yields a marked improvement that is comparable to state-of-the-art methods: the dominant matchers achieve F1 scores of 0.9271 for Anatomy, 0.8218 for FMA-NCI, and 0.50 for FMA-SNOMED. Extended reduction anchors resolve the scalability problem of matching large biomedical ontologies and significantly reduce time complexity with little loss in F1: a 0.21% decrease for Anatomy and a 0.84% decrease for FMA-NCI, alongside a 2.65% increase for FMA-SNOMED.
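For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall, computed over a predicted alignment and a reference alignment. The Python sketch below shows the standard definition. It assumes each correspondence is represented as a (source, target) pair, and the function name is hypothetical.

```python
def precision_recall_f1(predicted, reference):
    """Standard precision, recall, and F1 over two sets of correspondences."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, a predicted alignment of three pairs, two of which appear in a four-pair reference, gives a precision of 2/3, a recall of 1/2, and an F1 of 4/7.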

Conclusions:

We systematically investigated and compared the effectiveness of different matching clues, matchers, and combination strategies. Our empirical study demonstrates that distinguishing clues perform better than using all the clues available in the ontologies when matching biomedical ontologies. Compared with matchers that use a single clue, matchers that combine multiple clues perform more stably and accurately. In addition, our results provide evidence that the approach based on extended reduction anchors performs well on large ontology matching tasks, demonstrating an effective solution to the scalability problem.

Citation

Please cite as:

Wang P, Hu Y, Bai S, Zou S

Matching Biomedical Ontologies: Construction of Matching Clues and Systematic Evaluation of Different Combinations of Matchers

JMIR Med Inform 2021;9(8):e28212

DOI: 10.2196/28212

PMID: 34420930

PMCID: 8414291

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Feb 26, 2021

Date Accepted: May 19, 2021
