Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 21, 2020
Date Accepted: Mar 29, 2020
Towards a Knowledge Graph of Combined Drug Therapies using Semantic Predications from Biomedical Literature
ABSTRACT
Background:
Combined medication plays an important role in the effective treatment of malignant neoplasms and in the development of precision medicine. A large number of clinical studies have already investigated combined drug therapies. Automatic knowledge discovery of these combinations, together with their graphical presentation in a knowledge graph, is critical for recognizing which two or more drugs are combined to treat a specific type of cancer, and for improving drug efficacy and the treatment of human disorders.
Objective:
This paper aims to develop an automatic, visual approach to discovering knowledge about combined drug therapies from the biomedical literature, especially from high-level evidence such as clinical trial reports and clinical practice guidelines.
Methods:
Based on semantic predications, which have a subject-predicate-object (SPO) triple structure, we propose an automatic algorithm that discovers knowledge of combined drug therapies using two rules: 1) two or more semantic predications (S1-P-O and Si-P-O, i = 2, 3, …) can be extracted from one conclusive claim (sentence) in the abstract of a given publication, and 2) these predications have an identical predicate (one closely related to human disease treatment, e.g., treat) and an identical object (e.g., a disease) but different subjects (e.g., drugs). A customized knowledge graph, rather than traditional semantic triples, is used to organize and visualize these combinations. After automatically filtering out broad concepts such as “pharmacologic actions” and generic disease names, a set of combined drug therapies was identified and characterized through manual interpretation.
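The two rules above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Predication` type, the function name, the filter lists, and the placeholder drug and disease names are all assumptions made for illustration, and the input is assumed to be triples already extracted per sentence.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Predication(NamedTuple):
    """One subject-predicate-object triple (hypothetical structure)."""
    sentence_id: str  # identifies the conclusive claim the triple came from
    subject: str
    predicate: str
    obj: str


# Illustrative values only; the actual lists are not given in the abstract.
TREATMENT_PREDICATES = {"TREATS"}
BROAD_SUBJECTS = {"pharmacologic actions"}
GENERIC_DISEASES = {"disease"}


def find_combination_candidates(
    predications: Iterable[Predication],
) -> Dict[Tuple[str, str, str], List[str]]:
    """Group triples sharing sentence, predicate and object; keep groups
    that have at least two distinct subjects (candidate drug combinations)."""
    groups = defaultdict(set)
    for p in predications:
        predicate = p.predicate.upper()
        if predicate not in TREATMENT_PREDICATES:
            continue  # rule 2: predicate must relate to treatment
        if p.subject.lower() in BROAD_SUBJECTS:
            continue  # drop overly broad subject concepts
        if p.obj.lower() in GENERIC_DISEASES:
            continue  # drop generic disease names
        groups[(p.sentence_id, predicate, p.obj)].add(p.subject)
    # rule 1: two or more predications from the same conclusive claim
    return {key: sorted(subs) for key, subs in groups.items() if len(subs) >= 2}


# Placeholder data, invented for this example.
example = [
    Predication("s1", "Drug A", "TREATS", "Disease X"),
    Predication("s1", "Drug B", "TREATS", "Disease X"),
    Predication("s1", "Pharmacologic Actions", "TREATS", "Disease X"),
    Predication("s2", "Drug C", "TREATS", "Disease Y"),
]
candidates = find_combination_candidates(example)
```

Grouping on the sentence identifier keeps subjects from different claims apart, so two drugs that treat the same disease in separate sentences are not reported as a combination.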
Results:
We retrieved 22,263 clinical trial reports and 31 clinical practice guidelines from PubMed abstracts, using the search term “antineoplastic agents” to restrict the drugs covered (Jan 2009 - Oct 2019). A total of 15,603 conclusive claims were parsed locally using the cue words “conclusion*” and “conclude*” and passed to SemRep for semantic predication extraction; 325 candidate groups of semantic predications about combined medications were then automatically discovered from 316 conclusive claims. Based on manual analysis, we determined that 255 of them (78.46%) were accurate combined drug therapies and adopted them for customized knowledge graph construction. We also identified two categories (and four subcategories) that explain the inaccurate results: limitations of SemRep and limitations of the proposed approach. In addition, we identified the predominant combination patterns according to drug mechanisms, which can inform studies of new combined medications, and found four clear markers (“combin*”, “coadministration”, “co-administered” and “regimen”) that identify potential combined drug therapies and can support machine learning algorithm development in future work.
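The cue words and markers named above translate naturally into regular expressions. The following Python sketch is one possible reading, not the authors' parser: the sentence splitter is deliberately naive, the function names are hypothetical, and the trailing `*` in each cue word is assumed to mean "any word ending".

```python
import re
from typing import List

# "conclusion*" and "conclude*" read as prefix wildcards (assumption).
CONCLUSION_CUE = re.compile(r"\b(?:conclusion|conclude)\w*", re.IGNORECASE)

# The four markers reported in the Results; only "combin*" carries a wildcard.
COMBINATION_MARKER = re.compile(
    r"\b(?:combin\w*|coadministration|co-administered|regimen)\b",
    re.IGNORECASE,
)


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def conclusive_claims(abstract: str) -> List[str]:
    """Return the sentences that contain a conclusion cue word."""
    return [s for s in split_sentences(abstract) if CONCLUSION_CUE.search(s)]


def has_combination_marker(sentence: str) -> bool:
    """True if the sentence contains one of the four combination markers."""
    return COMBINATION_MARKER.search(sentence) is not None


# Placeholder abstract text, invented for this example.
abstract = (
    "Patients received drug A. "
    "We conclude that drug A combined with drug B was effective. "
    "CONCLUSIONS: The regimen was well tolerated."
)
claims = conclusive_claims(abstract)
flagged = [c for c in claims if has_combination_marker(c)]
```

A production pipeline would likely need a biomedical sentence splitter, since abbreviations and decimal values in abstracts defeat a punctuation-based split.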
Conclusions:
Using semantic predications from conclusive claims in the biomedical literature can support automatic knowledge discovery and knowledge graph construction for combined drug therapies. A machine learning-based approach is warranted to take full advantage of the identified markers and other contextual features.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.