Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 21, 2022
Date Accepted: Sep 22, 2023
Potential Target Discovery and Drug Repurposing for Coronavirus: A Knowledge Graph-Based Approach
ABSTRACT
Background:
The global pandemics of SARS, MERS, COVID-19, and other coronavirus-caused diseases have led to unprecedented public health crises. Coronaviruses are constantly evolving, and it is unknown what the next new coronavirus will be or when it will sweep the world. Knowledge graphs are expected to help discover the pathogenicity and transmission mechanisms of viruses.
Objective:
The aim of this study is to discover potential targets and candidate drugs to repurpose for coronavirus through a knowledge graph-based approach.
Methods:
We propose a computational and evidence-based knowledge discovery approach to identify potential targets and candidate drugs for coronavirus from the biomedical literature and well-known knowledge bases. To organize the semantic triples extracted automatically from the biomedical literature, we designed a semantic conversion model. The literature knowledge was associated and integrated with existing drug and gene knowledge through semantic mapping to construct the coronavirus knowledge graph (CovKG). We used knowledge graph technologies such as SPARQL queries and logical reasoning, and adopted six state-of-the-art knowledge graph embedding models, to discover unrecorded drug mechanisms of action as well as potential targets and drug candidates. Furthermore, we provided evidence-based support for the findings through backtracking and a triple co-occurrence mechanism.
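As an illustration of how a translation-based embedding model scores a candidate triple, the following is a minimal sketch of the standard TransR scoring function, where entity vectors are projected into a relation-specific space before the translation is applied. It is not the authors' implementation: the function names, toy vectors, and projection matrix are hypothetical, and real embeddings are learned from the graph rather than set by hand.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transr_score(head, relation, tail, projection):
    """TransR distance ||M_r h + r - M_r t||_2; lower means more plausible.

    head, tail: entity embeddings (entity space)
    relation:   relation embedding (relation space)
    projection: relation-specific matrix M_r mapping entity space to relation space
    """
    h_r = mat_vec(projection, head)
    t_r = mat_vec(projection, tail)
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(h_r, relation, t_r)))

# Toy values (assumptions for illustration only): 3-d entities, 2-d relation space.
drug = [1.0, 0.0, 2.0]
target_a = [0.0, 1.0, 2.0]
target_b = [1.0, 0.0, 2.0]
interacts_with = [-1.0, 1.0]
m_r = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]

# The lower-scoring candidate is ranked as the more plausible tail entity.
score_a = transr_score(drug, interacts_with, target_a, m_r)
score_b = transr_score(drug, interacts_with, target_b, m_r)
print(score_a, score_b)
```

In link prediction, every candidate tail (or head) entity is scored this way and the candidates are sorted by distance; the position of the true entity in that ordering is the rank used by the evaluation metrics.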
Results:
The constructed CovKG contains 17,369,620 triples, of which 641,195 were extracted from the biomedical literature, covering 13,065 concept unique identifiers (CUIs), 209 semantic types, and 97 semantic relations of the Unified Medical Language System (UMLS). Through multi-source knowledge integration, 475 drugs and 262 targets were mapped to existing knowledge, and traditional logical reasoning techniques found 41 new drug mechanisms of action that were not recorded in the existing knowledge base. Among the 6 knowledge graph embedding models, TransR outperformed the others (MRR = 0.2430, Hits@10 = 0.3279). A total of 33 potential targets and 18 drug candidates were identified for coronavirus. Among them, 7 novel drugs (i.e., quinine, nelfinavir, ivermectin, asunaprevir, tylophorine, A. annua, and resveratrol) and 3 highly ranked targets (i.e., angiotensin-converting enzyme 2, TMPRSS2, and M proteins) are further discussed.
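For readers unfamiliar with the two metrics reported for the embedding models, the sketch below shows how mean reciprocal rank (MRR) and Hits@10 are conventionally computed from the rank of each true entity among the scored candidates. The helper names and the toy ranks are assumptions for illustration and are unrelated to the reported figures.

```python
def rank_of_true(scores, true_index):
    """Rank (1 = best) of the true candidate when lower scores are better.

    Ties are resolved optimistically: only strictly better scores count.
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s < true_score)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose true entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy ranks for four hypothetical test triples.
example_ranks = [1, 2, 4, 20]
print(mean_reciprocal_rank(example_ranks))
print(hits_at_k(example_ranks, k=10))
```

Under these definitions, an MRR of 0.2430 corresponds to an average reciprocal rank of roughly 1/4, and a Hits@10 of 0.3279 means that about one third of the true entities were ranked in the top 10.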
Conclusions:
We demonstrated that a knowledge graph-based approach is effective for potential target discovery and drug repurposing for coronavirus. Our approach can be extended to other viruses or diseases for biomedical knowledge discovery and related applications. Source code and data are available at https://github.com/lp/CovKG.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.