Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 8, 2022
Date Accepted: Aug 18, 2023
Construction of the Emotional Lexicon of Breast Cancer Patients
ABSTRACT
Background:
Sentiment analysis based on an emotional lexicon has clear advantages in capturing emotional information such as individual attitudes, experiences, and needs, and it offers a new perspective and method for emotion recognition and management in patients with breast cancer. However, sentiment analysis in the field of breast cancer remains limited, and no emotional lexicon exists for this field. There is therefore an urgent need to construct an emotional lexicon that reflects the characteristics of patients with breast cancer, providing a new tool for accurately identifying and analyzing patients’ emotions and a new method for personalized emotion management.
Objective:
To construct an emotional lexicon for patients with breast cancer.
Methods:
Emotional words were obtained by merging words from two general lexicons, C-LIWC and HowNet, with words from text corpora collected from breast cancer patients via Weibo, semi-structured interviews, and expressive writing. The fine-grained emotion categories of the lexicon were determined by combining Ekman’s basic emotion categories, Lazarus’ cognitive appraisal theory of emotion, and a qualitative text analysis of the patient corpora. The lexicon was then constructed through manual annotation and classification, guided by Russell’s valence-arousal space. Precision, recall, and F1 score were used to evaluate the lexicon’s performance.
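The merging and evaluation steps described above can be illustrated with a minimal sketch. It assumes that lexicons are plain word lists and that evaluation compares the set of words the lexicon flags against a manually labeled gold set; the abstract does not specify the actual procedure. The function names and the toy English word lists are hypothetical and are not taken from the study.

```python
def merge_lexicons(*sources):
    """Union several word lists into one set, trimming whitespace and dropping empties.

    Assumption: each source is an iterable of strings. The study's actual
    merging and de-duplication rules are not described in the abstract.
    """
    merged = set()
    for words in sources:
        merged.update(w.strip() for w in words if w.strip())
    return merged


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for words flagged as emotional."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy placeholder data (hypothetical, for illustration only).
general_words = ["happy", "afraid"]
corpus_words = ["afraid", " hopeful ", "worried", ""]
lexicon = merge_lexicons(general_words, corpus_words)

gold_emotional = {"afraid", "worried", "hopeful", "tearful"}
precision, recall, f1 = precision_recall_f1(lexicon, gold_emotional)
print(sorted(lexicon), precision, recall, round(f1, 3))
```

Treating the lexicon as a set makes overlapping entries from the general lexicons and the patient corpora collapse automatically, which is one simple way to read "merging" here.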
Results:
Text corpora were collected from patients at different stages of breast cancer and comprised 150 written materials, 17 interviews, and 6,689 original posts and comments from Weibo, totaling 1,923,593 Chinese characters. The resulting emotional lexicon contains 9,357 words and covers eight fine-grained emotion categories: joy, anger, sadness, fear, disgust, surprise, somatic symptoms, and breast cancer terminology. In the experiments, the precision, recall, and F1 score for positive emotional words were 98.42%, 99.73%, and 99.07%, respectively; for negative emotional words they were 99.73%, 98.38%, and 99.05%, respectively. All of these results significantly outperformed those of C-LIWC and HowNet.
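As a quick consistency check on the reported figures, the F1 score can be recomputed as the harmonic mean of precision and recall. This sketch assumes the standard balanced F1 definition, which the abstract does not state explicitly; the function name is illustrative.

```python
def f1_from(precision, recall):
    """Balanced F1: the harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Percentages as reported in the Results.
positive_f1 = f1_from(98.42, 99.73)
negative_f1 = f1_from(99.73, 98.38)

print(round(positive_f1, 2))  # 99.07
print(round(negative_f1, 2))  # 99.05
```

Both recomputed values agree with the reported F1 scores of 99.07% and 99.05%.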
Conclusions:
An emotional lexicon with fine-grained emotion categories was constructed that reflects the characteristics of patients living with breast cancer. Its performance in identifying and classifying domain-specific emotional words in breast cancer was better than that of C-LIWC and HowNet. The lexicon provides a new tool for sentiment analysis in the field of breast cancer, as well as a new perspective for identifying the specific emotional states and needs of breast cancer patients and for formulating tailored emotion management plans.
Citation
Per the author's request the PDF is not available.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.