Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 7, 2022
Date Accepted: Apr 26, 2022
Date Submitted to PubMed: Apr 27, 2022

The final, peer-reviewed published version of this preprint can be found here:

Construction of a Linked Data Set of COVID-19 Knowledge Graphs: Development and Applications

Wang H, Du H, Qi G, Chen H, Hu W, Chen Z

JMIR Med Inform 2022;10(5):e37215

DOI: 10.2196/37215

PMID: 35476822

PMCID: 9109781

Construction of A Linked Dataset of COVID-19 Knowledge Graphs: Development and Applications

  • Haofen Wang
  • Huifang Du
  • Guilin Qi
  • Huajun Chen
  • Wei Hu
  • Zhuo Chen

ABSTRACT

Background:

As COVID-19 continues to spread, the volume of information about the pandemic is growing rapidly, and organizing information at this scale is both necessary and important. Knowledge graphs (KGs), a key branch of artificial intelligence, help to structure, reason over, and understand data.

Objective:

To make this information more useful and to help researchers combat COVID-19, we have constructed and incrementally released OpenKG-COVID19, a unified linked dataset and one of the largest knowledge graphs about COVID-19. OpenKG-COVID19 comprises ten interlinked COVID-19 sub-graphs covering encyclopedia, concept, medical, research, event, health, epidemiology, goods, prevention, and character knowledge.

Methods:

In this paper, we introduce the key techniques used to build the COVID-19 KGs in a top-down manner. First, we describe the schema modeling process for each KG in OpenKG-COVID19. Second, we propose different methods for extracting knowledge from open government sites, professional texts, public domain-specific sources, and public encyclopedia sites. The ten curated COVID-19 KGs are then linked together at both the schema level and the data level. In addition, we present the naming convention for OpenKG-COVID19.
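To illustrate the difference between schema-level and data-level linking, here is a minimal sketch over two toy sub-graphs. It is not the authors' implementation: the `med:` and `ency:` identifiers are hypothetical, and the use of `rdfs:subClassOf` for schema links and `owl:sameAs` for data links is an assumption based on common linked-data practice, since the abstract does not name the predicates used.

```python
# Minimal sketch: schema-level vs. data-level links between two sub-graphs.
# All identifiers and the choice of predicates are illustrative assumptions.

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
OWL_SAME_AS = "owl:sameAs"

# Two toy sub-graphs, each a set of (subject, predicate, object) triples.
medical_kg = {("med:COVID-19", RDF_TYPE, "med:Disease")}
encyclopedia_kg = {("ency:COVID-19", RDF_TYPE, "ency:InfectiousDisease")}

# Schema-level link: relates two classes from different sub-graphs.
schema_links = {("ency:InfectiousDisease", RDFS_SUBCLASS_OF, "med:Disease")}
# Data-level link: states that two entities denote the same thing.
data_links = {("med:COVID-19", OWL_SAME_AS, "ency:COVID-19")}

linked = medical_kg | encyclopedia_kg | schema_links | data_links


def same_as_closure(entity, triples):
    """Return every identifier connected to `entity` through owl:sameAs."""
    seen, stack = {entity}, [entity]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            # owl:sameAs is symmetric, so follow it in both directions.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen


def types_of(entity, triples):
    """Collect the classes of `entity`, using both kinds of link."""
    aliases = same_as_closure(entity, triples)
    classes = {o for s, p, o in triples if p == RDF_TYPE and s in aliases}
    stack = list(classes)
    while stack:  # walk up the class hierarchy via rdfs:subClassOf
        current = stack.pop()
        for s, p, o in triples:
            if p == RDFS_SUBCLASS_OF and s == current and o not in classes:
                classes.add(o)
                stack.append(o)
    return classes


print(sorted(types_of("ency:COVID-19", encyclopedia_kg)))
print(sorted(types_of("ency:COVID-19", linked)))
```

Without the links, the encyclopedia entity is known only by its own class; once the links are added, a query on either identifier also reaches the class defined in the other sub-graph.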

Results:

OpenKG-COVID19 has more than 2,572 concepts, 329,600 entities, 513 properties, and 2,687,329 facts, and the dataset will be updated continuously. Each COVID-19 KG has been evaluated, and the average precision is more than 93%. The OpenKG-COVID19 dataset is available on the OpenKG website, where individual sub-graphs can be downloaded. We have developed search and browse interfaces and a SPARQL endpoint to make access more user-friendly. Possible intelligent applications that could be built on OpenKG-COVID19 are also listed in this paper.
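As a rough illustration of how a SPARQL endpoint like the one mentioned above might be queried, the sketch below builds a query and encodes it as a GET request using the `query` parameter defined by the SPARQL 1.1 Protocol. The endpoint URL is a placeholder, not the real address, and the assumption that entities carry `rdfs:label` values is ours; the abstract specifies neither.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the abstract does not give the real address.
ENDPOINT = "https://example.org/openkg-covid19/sparql"


def build_query(label, limit=10):
    """Build a SELECT query listing the facts about entities with this label.

    Assumes entities are labeled with plain rdfs:label literals; a dataset
    that uses language-tagged labels would need a different pattern.
    """
    # Escape backslashes and double quotes so the label is a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity ?property ?value WHERE {\n"
        f'  ?entity rdfs:label "{escaped}" .\n'
        "  ?entity ?property ?value .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, build_query("COVID-19", limit=5))
print(url)
```

The resulting URL could be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header to request JSON results.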

Conclusions:

Knowledge graphs are known to be useful for intelligent question answering, semantic search, recommendation systems, visualization analysis, and decision-making support. Research on COVID-19, biomedicine, and many other fields can benefit from OpenKG-COVID19. Furthermore, the ten KGs will be continuously updated to ensure that the public can access sufficient and up-to-date knowledge.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.