Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 24, 2020
Date Accepted: Jul 17, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
TranQL: An Interactive Query and Visualization Environment for Federated Knowledge Graphs
ABSTRACT
Background:
Large knowledge graphs linking biomedical objects based on curated assertions are now commonplace. Efforts are underway to semantically integrate these knowledge graphs using common upper-level ontologies to federate graph-oriented application programming interfaces to the data. However, federation poses challenges in that any query system must route queries to the appropriate knowledge sources, generate and evaluate answer subsets, and then merge answer subsets into a semantically valid, coherent whole for visualization and exploration by users.
Objective:
We aimed to develop an interactive environment for query, visualization, and deep exploration of federated knowledge graphs.
Methods:
We developed TranQL (Translator Query Language) as a biomedical query language and web application interface to query semantically federated knowledge graphs and explore query results. TranQL leverages the framework developed as part of the Biomedical Data Translator program (‘Translator’), funded by the National Center for Advancing Translational Sciences, National Institutes of Health. Specifically, TranQL uses the BioLink data model as an upper-level biomedical ontology and the Translator Knowledge Graph Standard (KGS) application programming interface (API) to specify a protocol for expressing a query as a graph of BioLink data elements compiled from statements in the TranQL query language. Queries are mapped to federated knowledge sources, and answers are merged into a knowledge graph, with mappings between the knowledge graph and specific elements of the query. The TranQL interactive web application comprises three components: the TranQL backplane service, which is an OpenAPI service that provides a protocol normalization layer over federated Translator KGS API endpoints; the TranQL query service, which uses the TranQL schema to describe the semantics of each Translator KGS API endpoint; and the TranQL user interface (UI), which is a single-page web application that includes a query editor, cache function, graph visualization environment, answer viewer, and visualization controls.
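The merge step described above can be illustrated with a minimal Python sketch. This is not the TranQL implementation: the function name `merge_knowledge_graphs`, the response layout (`knowledge_graph`, `nodes`, `edges`, `source_id`, `target_id`, `type`), and the `EX:` identifiers are all hypothetical, chosen only to show how answer subsets from two endpoints might be combined into one graph by deduplicating nodes on their identifier.

```python
from typing import Any, Dict, List, Tuple


def merge_knowledge_graphs(responses: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the node and edge lists of several responses into one graph.

    Assumption: each response is a dict with a "knowledge_graph" key that
    holds "nodes" and "edges" lists. This layout is hypothetical.
    """
    nodes: Dict[str, Dict[str, Any]] = {}
    edges: Dict[Tuple[str, str, str], Dict[str, Any]] = {}

    for response in responses:
        graph = response.get("knowledge_graph", {})

        for node in graph.get("nodes", []):
            # Nodes sharing an identifier are treated as the same entity.
            merged = nodes.setdefault(node["id"], {"id": node["id"], "type": []})
            for node_type in node.get("type", []):
                if node_type not in merged["type"]:
                    merged["type"].append(node_type)
            # Keep the first value seen for any other attribute.
            for key, value in node.items():
                if key not in ("id", "type"):
                    merged.setdefault(key, value)

        for edge in graph.get("edges", []):
            # Edges are deduplicated on (source, predicate, target).
            key = (edge["source_id"], edge["type"], edge["target_id"])
            edges.setdefault(key, edge)

    return {"nodes": list(nodes.values()), "edges": list(edges.values())}


# Hypothetical responses standing in for two federated endpoints.
endpoint_a = {
    "knowledge_graph": {
        "nodes": [
            {"id": "EX:chem1", "type": ["chemical_substance"], "name": "example chemical"},
            {"id": "EX:disease1", "type": ["disease"]},
        ],
        "edges": [
            {"source_id": "EX:chem1", "target_id": "EX:disease1", "type": "associated_with"},
        ],
    }
}
endpoint_b = {
    "knowledge_graph": {
        "nodes": [
            {"id": "EX:chem1", "type": ["chemical_substance", "named_thing"]},
            {"id": "EX:gene1", "type": ["gene"]},
        ],
        "edges": [
            {"source_id": "EX:chem1", "target_id": "EX:gene1", "type": "interacts_with"},
            {"source_id": "EX:gene1", "target_id": "EX:disease1", "type": "associated_with"},
        ],
    }
}

merged = merge_knowledge_graphs([endpoint_a, endpoint_b])
print(len(merged["nodes"]), "nodes,", len(merged["edges"]), "edges")
```

In this sketch the shared node `EX:chem1` appears once in the merged graph with the union of its types, which is the property a federated query needs for the result to be traversable as a single graph.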
Results:
We developed real-world use cases to validate TranQL and address biomedical questions of relevance to translational science. The use cases posed questions that traversed two federated Translator KGS API endpoints. The first endpoint was ICEES (Integrated Clinical and Environmental Exposures Service), which provides open access to clinical data that have been integrated with a variety of public environmental exposures data. The second endpoint was ROBOKOP, which is an open question-answering system that provides access to linked biomedical entities such as ‘gene’, ‘chemical substance’, and ‘disease’ that are derived largely from curated public data sources. We successfully posed queries to TranQL that traversed these endpoints and retrieved answers that we then visualized and evaluated.
Conclusions:
TranQL can be used to ask questions of relevance to translational science, rapidly obtain answers that require assertions from a federation of knowledge sources, and provide valuable insights for translational research and clinical practice.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.