Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 22, 2022
Open Peer Review Period: Dec 22, 2022 - Jan 9, 2023
Date Accepted: Mar 12, 2023
Leveraging Knowledge Graphs and Natural Language Processing for Automated Web Resource Labeling: Knowledge Mobilization in Neurodevelopmental Disorders.
ABSTRACT
Background:
With the abundance of online information, providing patients and families with trusted information is needed more than ever. Several organizations aim to build databases that target groups can search according to their needs. One such group is individuals with neurodevelopmental disabilities (NDDs) and their families. NDDs affect up to 18% of the population and have major social and economic impacts. Current limitations in communicating information for individuals with NDDs include the absence of shared terminology and the lack of efficient labeling processes for web resources. As a result, health professionals, support groups, and families are unable to share, combine, and access resources.
Objective:
We aim to develop a natural language processing-based pipeline that labels resources by leveraging standard vocabularies and a free-text vocabulary obtained through text analysis, and then represents those resources as a weighted knowledge graph.
Methods:
Using a combination of experience experts and service/organization databases, we created a dataset of web resources for NDDs. Text from these websites is scraped and collected into a corpus of textual data on neurodevelopmental disorders. This corpus is used to construct a knowledge graph suitable for use by both experts and non-experts. Named entity recognition, topic modelling, document classification, and location detection are used to extract knowledge from the corpus.
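As a rough illustration of how annotations from these extraction steps might be stored as a weighted graph, the minimal sketch below keeps one weighted edge per (resource, relation, label) triple. The relation names, the example resources, the weights, and the choice to keep the highest weight when an annotation repeats are all assumptions made for illustration; the abstract does not specify the graph schema or how weights are computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (resource, relation, label, weight); every field value used below is hypothetical.
Annotation = Tuple[str, str, str, float]
Graph = Dict[str, Dict[Tuple[str, str], float]]


def build_graph(annotations: Iterable[Annotation]) -> Graph:
    """Store annotations as weighted edges from a resource to a labeled node."""
    graph: Dict[str, Dict[Tuple[str, str], float]] = defaultdict(dict)
    for resource, relation, label, weight in annotations:
        edge = (relation, label)
        # Assumption: when an annotation repeats, keep the highest weight seen.
        graph[resource][edge] = max(weight, graph[resource].get(edge, 0.0))
    return dict(graph)


annotations = [
    ("resource_a", "mentions_entity", "autism", 0.9),
    ("resource_a", "has_topic", "speech therapy", 0.2),
    ("resource_a", "mentions_entity", "autism", 0.4),  # repeated annotation
    ("resource_b", "located_in", "example_city", 1.0),
]
graph = build_graph(annotations)
```

Keeping the relation in the edge key lets the same label appear under different extraction methods (for example, as a named entity and as a topic) without one overwriting the other.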
Results:
We developed a resource annotation pipeline using diverse natural language processing algorithms to annotate web resources and store them in a structured knowledge graph containing 78,181 annotations obtained from the combination of standard terminologies and a free-text vocabulary obtained using topic modelling. An application of the constructed knowledge graph is illustrated: a resource search interface using the ordered weighted averaging operator to rank resources based on a user query.
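The ordered weighted averaging (OWA) operator sorts its inputs in descending order and then takes a weighted sum, so the weights attach to rank positions rather than to particular query terms. The sketch below shows one way such an operator could rank resources against a query. The resource names, annotation weights, query terms, and OWA weight vector are hypothetical, and scoring a missing annotation as 0.0 is an assumption; this is not the search interface described above.

```python
from typing import Dict, List, Sequence, Tuple


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to scores sorted in descending order."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def rank_resources(
    resources: Dict[str, Dict[str, float]],
    query_terms: List[str],
    weights: Sequence[float],
) -> List[Tuple[str, float]]:
    """Score each resource by the OWA of its annotation weights for the query terms."""
    ranked = []
    for name, labels in resources.items():
        # Assumption: a query term with no matching annotation scores 0.0.
        scores = [labels.get(term, 0.0) for term in query_terms]
        ranked.append((name, owa(scores, weights)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical annotation weights for two resources.
resources = {
    "resource_a": {"autism": 0.9, "speech therapy": 0.2},
    "resource_b": {"autism": 0.5, "speech therapy": 0.6},
}
# Putting more weight on the lower-ranked score rewards resources that match
# every query term reasonably well over those that match only one term strongly.
ranking = rank_resources(resources, ["autism", "speech therapy"], [0.3, 0.7])
```

With the weight vector shifted toward the first position instead, the operator behaves more like a maximum and favors a strong match on any single term.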
Conclusions:
This automated labeling pipeline for web resources on NDDs, together with the use of a knowledge graph, showcases how AI can enhance knowledge extraction and mobilization in NDDs and, in the future, in other fields of medicine.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.