Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 22, 2022
Open Peer Review Period: Dec 22, 2022 - Jan 9, 2023
Date Accepted: Mar 12, 2023

The final, peer-reviewed published version of this preprint can be found here:

Leveraging Knowledge Graphs and Natural Language Processing for Automated Web Resource Labeling and Knowledge Mobilization in Neurodevelopmental Disorders: Development and Usability Study

Costello J, Kaur M, Reformat MZ, Bolduc FV

J Med Internet Res 2023;25:e45268

DOI: 10.2196/45268

PMID: 37067865

PMCID: 10152329

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Leveraging Knowledge Graph and Natural Language Processing for Automated Web Resource Labeling: Knowledge Mobilization in Neurodevelopmental Disorders.

  • Jeremy Costello
  • Manpreet Kaur
  • Marek Z Reformat
  • Francois V Bolduc

ABSTRACT

Background:

Given the abundance of online information, providing patients and families with trusted information is needed more than ever. Several organizations aim to build databases that target groups can search according to their needs. One such group is individuals with neurodevelopmental disabilities (NDDs) and their families. NDDs affect up to 18% of the population and have major social and economic impacts. Current limitations in communicating information for individuals with NDDs include the absence of shared terminology and the lack of efficient labeling processes for web resources. As a result, health professionals, support groups, and families cannot readily share, combine, and access resources.

Objective:

We aim to develop a natural language-based pipeline that labels resources by leveraging standard vocabularies and a free-text vocabulary obtained through text analysis, and then represents those resources as a weighted knowledge graph.

Methods:

Using a combination of experience experts and service/organization databases, we created a dataset of web resources for NDDs. Text from these websites is scraped and collected into a corpus of textual data on neurodevelopmental disorders. This corpus is used to construct a knowledge graph suitable for use by both experts and non-experts. Named entity recognition, topic modelling, document classification, and location detection are used to extract knowledge from the corpus.
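
As an illustration only (this is not the authors' implementation), a weighted knowledge graph of annotated resources could be sketched as below. The class name, the relation names (`mentions_entity`, `has_topic`, `has_category`, `located_in`), the example URL, and the weights are all hypothetical; the only thing taken from the text is the idea that each extraction step contributes labels, and that each label carries a weight.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class WeightedKnowledgeGraph:
    """Stores (resource, relation, label) edges, each with a weight in [0, 1]."""

    def __init__(self) -> None:
        # resource -> list of (relation, label, weight)
        self._edges: Dict[str, List[Tuple[str, str, float]]] = defaultdict(list)

    def annotate(self, resource: str, relation: str, label: str, weight: float) -> None:
        """Attach one weighted label to a resource."""
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must be in [0, 1]")
        self._edges[resource].append((relation, label, weight))

    def labels(self, resource: str, relation: str) -> List[Tuple[str, float]]:
        """Return the labels of one relation type, strongest first."""
        found = [(lab, w) for rel, lab, w in self._edges.get(resource, []) if rel == relation]
        return sorted(found, key=lambda pair: pair[1], reverse=True)

    def resources_with(self, label: str) -> List[Tuple[str, float]]:
        """Return resources carrying a label (under any relation), strongest first."""
        hits = [
            (res, w)
            for res, edges in self._edges.items()
            for _, lab, w in edges
            if lab == label
        ]
        return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical annotations, one per extraction step named in the Methods.
graph = WeightedKnowledgeGraph()
url = "https://example.org/autism-support"  # placeholder URL, not from the dataset
graph.annotate(url, "mentions_entity", "autism", 0.92)    # named entity recognition
graph.annotate(url, "has_topic", "school supports", 0.61)  # topic modelling
graph.annotate(url, "has_category", "education", 0.80)     # document classification
graph.annotate(url, "located_in", "Alberta", 0.75)         # location detection
print(graph.labels(url, "has_topic"))
```

Keeping the weight on the edge rather than on the label lets the same label be strong for one resource and weak for another, which is what a ranking step needs.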

Results:

We developed a resource annotation pipeline that uses diverse natural language processing algorithms to annotate web resources and store them in a structured knowledge graph. The graph contains 78,181 annotations obtained from the combination of standard terminologies and a free-text vocabulary obtained using topic modelling. We illustrate one application of the constructed knowledge graph: a resource search interface that uses the ordered weighted averaging operator to rank resources based on a user query.
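
For readers unfamiliar with the ordered weighted averaging (OWA) operator, the following minimal sketch shows how it can rank resources. The operator itself is standard: scores are sorted in descending order and the weights are applied to rank positions rather than to particular query terms. Everything else here is an assumption made for illustration: the function names, the per-term match scores, and the weight vector are hypothetical and are not taken from the authors' search interface.

```python
from typing import Dict, List, Sequence, Tuple


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to rank positions, not to sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(scores, reverse=True)  # largest score first
    return sum(w * s for w, s in zip(weights, ordered))


def rank_resources(
    match_scores: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Aggregate each resource's per-term scores with OWA and sort best first."""
    ranked = [(name, owa(scores, weights)) for name, scores in match_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical match scores of two resources against a three-term query.
match_scores = {
    "resource_a": [0.9, 0.1, 0.2],  # strong on one term, weak on the others
    "resource_b": [0.5, 0.5, 0.4],  # moderate on every term
}

# Weights that emphasize the lowest scores behave like a soft "match all terms".
and_like = [0.2, 0.3, 0.5]
print(rank_resources(match_scores, and_like))
```

The choice of weight vector sets the behavior between two extremes: `[1, 0, 0]` reduces to the maximum (any term may match), while `[0, 0, 1]` reduces to the minimum (every term must match).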

Conclusions:

This automated labeling pipeline for web resources on NDDs, together with the use of a knowledge graph, will showcase how AI can enhance knowledge extraction and mobilization in NDDs and, in the future, in other fields of medicine.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.