Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 30, 2019
Open Peer Review Period: Dec 30, 2019 - Jan 10, 2020
Date Accepted: Apr 3, 2020

The final, peer-reviewed published version of this preprint can be found here:

Using Natural Language Processing Techniques to Provide Personalized Educational Materials for Chronic Disease Patients in China: Development and Assessment of a Knowledge-Based Health Recommender System

Wang Z, Huang H, Cui L, An J, Duan H, Ge H, Deng N

JMIR Med Inform 2020;8(4):e17642

DOI: 10.2196/17642

PMID: 32324148

PMCID: 7206519

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Using Natural Language Processing Techniques to Provide Personalized Health Education for Chronic Disease Patients: Implementation of A Knowledge-based Health Recommender System

  • Zheyu Wang
  • Haoce Huang
  • Liping Cui
  • Jiye An
  • Huilong Duan
  • Huiqing Ge
  • Ning Deng

ABSTRACT

Background:

Health education is an important intervention for improving the awareness and self-management abilities of patients with chronic diseases. The rapid development of information technologies has changed the form of patient education materials from traditional paper to electronic formats. The volume of educational materials on the Internet is now enormous, and their quality is highly variable. Patients without a medical background may find it hard to identify the materials most valuable to them.

Objective:

This study aimed to develop a health recommender system that recommends appropriate educational materials to patients with chronic diseases.

Methods:

We implemented a knowledge-based recommender system using an ontology and several natural language processing (NLP) techniques. The development process was divided into 2 stages. In stage 1, we constructed an ontology for chronic disease patient education in order to understand and analyze patient data. In stage 2, we implemented an algorithm that generates recommendations based on the ontology. Patient data and educational materials were mapped to the ontology and converted into vectors of the same length, and recommendations were then generated based on the similarity of these vectors. We used keyword extraction algorithms and pre-trained word embeddings to preprocess the educational materials. Specifically, the term frequency-inverse document frequency (TF-IDF) and TextRank methods were used to extract keywords, and the word2vec model was used to train the word embeddings. We also proposed 3 strategies to improve keyword extraction performance. The evaluation was based on a manually assembled gold standard dataset covering 50 patients and 100 educational materials. Recommendation performance was assessed using the macro precision of the top-ranked documents.
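The vector-matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measure or the vector encoding, so cosine similarity and a binary encoding are assumptions here, and the six-term `ONTOLOGY_TERMS` list, the function names, and the sample materials are all hypothetical stand-ins for the real ontology and data.

```python
import math

# Hypothetical stand-in for ontology classes (assumption, not the paper's terms).
ONTOLOGY_TERMS = [
    "hypertension", "diabetes", "blood pressure",
    "diet", "exercise", "medication",
]

def to_ontology_vector(keywords):
    """Map keywords to a fixed-length binary vector over the ontology terms.

    Binary encoding is an assumption; the source only says that patient data
    and materials become vectors of the same length.
    """
    present = {k.lower() for k in keywords}
    return [1.0 if term in present else 0.0 for term in ONTOLOGY_TERMS]

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(patient_keywords, materials, top_k=1):
    """Return the titles of the top_k materials most similar to the patient.

    `materials` maps a title to the keywords extracted from that material.
    """
    patient_vec = to_ontology_vector(patient_keywords)
    scored = [
        (cosine_similarity(patient_vec, to_ontology_vector(kws)), title)
        for title, kws in materials.items()
    ]
    # Sort by score only, highest first; ties keep insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]

materials = {
    "Managing blood pressure": ["hypertension", "blood pressure", "medication"],
    "Healthy eating with diabetes": ["diabetes", "diet"],
}
patient = ["hypertension", "blood pressure", "exercise"]
top = recommend(patient, materials, top_k=1)
```

In this toy example the patient shares two ontology terms with the first material and none with the second, so the first material ranks highest. In a fuller pipeline, the keyword lists would come from a keyword extraction step such as TF-IDF or TextRank rather than being written by hand.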

Results:

The constructed Chronic Disease Patient Education Ontology (CDPEO) consisted mainly of 2 levels. Level 1 included 5 terms (demographic, disease, physiological index, lifestyle, and medication), which describe the characteristics contained in the patient data and also correspond to the topics of the educational materials. Level 2 contained the detailed elements of each Level 1 class. The ontology vector is a 32-dimensional vector generated from the Level 2 classes. In the keyword extraction evaluation, the improved TextRank algorithm achieved the best precision, 53.2%, measured against manual extraction results. In the recommendation evaluation, the improved TF-IDF method achieved the highest macro precision, 97%, for the top-1 recommendation.
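To make the reported metric concrete, the following sketch shows one common way to compute macro precision over the top k recommendations: precision is computed per patient and then averaged across patients. The abstract does not give the exact formula, so this definition is an assumption, and the function names, patient IDs, and document IDs are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k

def macro_precision_at_k(rankings, gold, k):
    """Average per-patient precision@k, weighting every patient equally.

    `rankings` maps a patient ID to a ranked list of document IDs;
    `gold` maps a patient ID to the set of documents judged relevant.
    """
    scores = [
        precision_at_k(ranked, gold.get(patient, set()), k)
        for patient, ranked in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Toy data (hypothetical): two patients, three ranked documents each.
rankings = {"p1": ["a", "b", "c"], "p2": ["d", "e", "f"]}
gold = {"p1": {"a", "c"}, "p2": {"e"}}

macro_at_1 = macro_precision_at_k(rankings, gold, k=1)
macro_at_2 = macro_precision_at_k(rankings, gold, k=2)
```

With k = 1, a macro precision of 97% over 50 patients would mean that the single top-ranked material was judged relevant for nearly every patient, under this assumed definition.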

Conclusions:

This study implemented a knowledge-based health recommender system that provides personalized health education for patients with chronic diseases. The system proved to be effective, and we learned from the study that efficient NLP techniques for preprocessing educational materials are crucial to such systems.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.