Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 29, 2018
Open Peer Review Period: Jan 30, 2018 - Jul 19, 2018
Date Accepted: Jul 19, 2018
(closed for review but you can still tweet)
Construction Principles and Approaches of an Ontology-Based Knowledge Base Prototype on Cystic Fibrosis
ABSTRACT
Background:
Ontologies are key enabling technologies for the Semantic Web. The Web Ontology Language (OWL) is a semantic markup language for publishing and sharing ontologies.
Objective:
The supply of customizable, computable, and formally represented molecular genetics information and health information, via electronic health record (EHR) interfaces, can play a critical role in achieving precision medicine. In this study, we used cystic fibrosis as an example to build an ontology-based knowledge base prototype (OntoKBCF) to supply such information via an EHR prototype. In addition, we elaborate on the construction and representation principles, approaches, applications, and representation challenges that we faced in the construction of OntoKBCF. The principles and approaches can be referenced and applied in constructing other ontology-based domain knowledge bases.
Methods:
First, we defined the scope of OntoKBCF according to possible clinical information needs about cystic fibrosis on both a molecular level and a clinical phenotype level. We then selected the knowledge sources to be represented in OntoKBCF. We utilized top-to-bottom content analysis and bottom-up construction to build OntoKBCF. Protégé-OWL was used to construct OntoKBCF. The construction principles included (1) to use existing basic terms as much as possible; (2) to use intersection and combination in representations; (3) to represent as many different types of facts as possible; and (4) to provide 2-5 examples for each type. HermiT 1.3.8.413 within Protégé-5.1.0 was used to check the consistency of OntoKBCF.
Results:
OntoKBCF was constructed successfully, with the inclusion of 408 classes, 35 properties, and 113 equivalent classes. OntoKBCF includes both atomic concepts (such as amino acid) and complex concepts (such as “adolescent female cystic fibrosis patientâ€) and their descriptions. We demonstrated that OntoKBCF could make customizable molecular and health information available automatically and usable via an EHR prototype. The main challenges include the provision of a more comprehensive account of different patient groups as well as the representation of uncertain knowledge, ambiguous concepts, and negative statements and more complicated and detailed molecular mechanisms or pathway information about cystic fibrosis.
Conclusions:
Although cystic fibrosis is just one example, based on the current structure of OntoKBCF, it should be relatively straightforward to extend the prototype to cover different topics. Moreover, the principles underpinning its development could be reused for building alternative human monogenetic diseases knowledge bases.
Citation
Request queued. Please wait while the file is being generated. It may take some time.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.