JMIR Preprints #18395: Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization

Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization

Qian Zhu;
Dac-Trung Nguyen;
Gioconda Alyea;
Karen Hanson;
Eric Sid;
Anne Pariser

ABSTRACT

Background:

Rare diseases are often hard to diagnose precisely because many primary health care providers have had limited exposure to them. This can lead to missed, delayed, or inaccurate diagnoses even when an approved, effective therapy is available. Although many efforts have been made to develop comprehensive disease resources that capture rare disease information for clinical decision making and education, there is no single, standardized method to define and harmonize rare diseases across multiple resources. This introduces redundancy and inconsistency that may ultimately make these resources more confusing and harder to use widely. To overcome this obstacle and reduce the need for human curation and maintenance, we report our initial work to identify related diseases in the Genetic and Rare Diseases (GARD) database to support further data harmonization.

Objective:

We aimed to systematically determine disease relevance among rare diseases from the GARD database and to establish systematic rules for data harmonization. Ultimately, the results generated from this study can serve as a potential rare disease resource for clinical decision support.

Methods:

We computed disease similarity among the GARD diseases based on their mappings to several well-known rare disease resources, and used human adjudication to further evaluate and categorize the relevant disease pairs into predefined disease relevance groups. In addition, we adopted the disease relevance present among siblings in disease classification trees, and prioritized relevant diseases based on the number of shared phenotypes.
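The abstract does not state which similarity measure was used. As a minimal sketch only, assuming a Jaccard index over the sets of external identifiers each disease maps to, pairwise similarity could be computed as follows. The identifiers, the `mappings` dictionary, and the function names are hypothetical and are not taken from the study.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two identifier sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical mappings from GARD diseases to external resource identifiers.
mappings = {
    "GARD:A": {"ORPHA:1", "OMIM:10", "MONDO:100"},
    "GARD:B": {"ORPHA:1", "OMIM:10", "MONDO:100", "UMLS:C1"},
    "GARD:C": {"ORPHA:2"},
}


def similar_pairs(mappings: dict, threshold: float = 0.5) -> list:
    """Return (disease1, disease2, score) for pairs scoring above threshold."""
    pairs = []
    for d1, d2 in combinations(sorted(mappings), 2):
        score = jaccard(mappings[d1], mappings[d2])
        if score > threshold:  # strict cutoff, assumed for illustration
            pairs.append((d1, d2, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


print(similar_pairs(mappings))
```

In this toy input only the first two diseases share mapped identifiers, so they form the single candidate pair that would be passed on for human review.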

Results:

Using the GARD disease mappings to several well-known rare disease resources, we computed disease similarity: about 86% of disease pairs (339) were identified as relevant, of which 68% (268) had similarity scores greater than 0.5. In addition, by scanning the disease classification trees from MONDO and Orphanet, we identified a total of 102,034 disease pairs with one or more shared clinical phenotypes as relevant. Manual evaluation showed 88% accuracy in prioritizing relevant diseases by clinical phenotypes.
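The sibling-based approach can be illustrated with a small sketch: group diseases by their parent in a classification tree, keep sibling pairs that share at least one phenotype, and rank them by the number of shared phenotypes. The tree, the phenotype sets, and the function name below are hypothetical placeholders, and the authors' actual procedure may differ.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical child -> parent edges from a disease classification tree.
parent_of = {"D1": "P1", "D2": "P1", "D3": "P1", "D4": "P2"}

# Hypothetical phenotype annotations per disease.
phenotypes = {
    "D1": {"HP:1", "HP:2", "HP:3"},
    "D2": {"HP:2", "HP:3"},
    "D3": {"HP:1", "HP:9"},
    "D4": {"HP:1"},
}


def sibling_pairs_by_shared_phenotypes(parent_of: dict, phenotypes: dict) -> list:
    """Return sibling pairs with >= 1 shared phenotype, most shared first."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    ranked = []
    for siblings in children.values():
        for a, b in combinations(sorted(siblings), 2):
            shared = phenotypes.get(a, set()) & phenotypes.get(b, set())
            if shared:  # keep only pairs with one or more shared phenotypes
                ranked.append((a, b, len(shared)))
    return sorted(ranked, key=lambda r: r[2], reverse=True)


print(sibling_pairs_by_shared_phenotypes(parent_of, phenotypes))
```

Here D4 shares a phenotype with D1 but is not its sibling, so that pair is never considered; only pairs under the same parent are ranked.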

Conclusions:

We successfully identified relevant rare diseases from the GARD database via two different approaches: disease similarity comparison and adoption of disease relevance from disease siblings. The results will not only direct GARD data harmonization for use in expanding translational science research, but also improve data transparency and consistency across different disease resources and terminologies, toward the most robust and up-to-date knowledge on rare diseases.

Citation

Please cite as:

Zhu Q, Nguyen DT, Alyea G, Hanson K, Sid E, Pariser A

Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization: Preliminary Study

JMIR Med Inform 2020;8(10):e18395

DOI: 10.2196/18395

PMID: 33006565

PMCID: 7568218

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Feb 24, 2020

Date Accepted: Aug 19, 2020
