Currently accepted at: JMIR Cancer

Date Submitted: Apr 15, 2025
Date Accepted: Feb 23, 2026

This paper has been accepted and is currently in production.

It will appear shortly on 10.2196/76044

The version shown here is the final accepted version (not yet copyedited).

Extracting Quality of Life Information of Breast Cancer Patients from Healthcare Online Forum Posts: A Feasibility Study

  • David Maria Schmidt
  • Raoul Schubert
  • Brian Chen
  • Deborah Kuk
  • Valmeek Kudesia
  • Andreas Hinz
  • Philipp Cimiano

ABSTRACT

Background:

Quality of Life (QoL) questionnaires are used in many disease areas to measure the burden that a disease places on patients. They provide insights into disease impact and unmet medical needs, and can inform patient-centered drug development and value assessment for treatments. Collecting these data imposes a significant burden on patients and requires effort from health care personnel, thus incurring high costs for the healthcare system. Given that patients share detailed information about their condition and treatment experiences on social media and patient fora, an important research question is to what extent information about QoL can be obtained from patients’ online forum posts to potentially complement information obtained from questionnaires.

Objective:

The objective of this study is to assess to what extent QoL information can be gained from the analysis of posts by patients in online healthcare communities, and whether this information is rich enough to estimate individual patients’ QoL based on their posts. We carry out this feasibility study in the context of breast cancer, as it is the most prevalent cancer in the female population.

Methods:

We recruited 134 female breast cancer patients on the Inspire.com online patient forum, who voluntarily participated in our feasibility study. They completed the EORTC QLQ-C30 and QLQ-BR23 questionnaires, consisting of 30 general questions and 23 additional breast cancer-specific questions, and provided consent to analyze their posts and comments on the online forum (756 posts, 19,478 comments). Human coders annotated the posts to identify parts of the text providing answers to one of the above-mentioned 53 questions.

Results:

The data annotation yielded a substantial agreement (mean Fleiss’ Kappa of 0.5). Overall, we found answers in the coded data for 50 out of 53 EORTC QLQ-C30 and QLQ-BR23 questions. The information coded in the posts reliably predicted the answers given in the questionnaires (F1 = 0.7). The 5 questions that were most frequently answered on the basis of the coded posts were: “Did you feel ill or unwell?” (304 of 2683 annotated posts and comments), “Did you worry?” (105 posts and comments), “Have you had pain?” (104 posts and comments), “Did you feel tense?” (85 posts and comments), and “Were you limited in doing either your work or other daily activities?” (77 posts and comments).
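The two summary statistics reported above, Fleiss' Kappa and the F1 score, can be illustrated with a minimal sketch based on their standard textbook definitions. The abstract does not describe how the authors computed these values, so this is not their implementation. The function names `fleiss_kappa` and `f1_score` and the toy ratings are hypothetical and chosen only for illustration; none of the numbers below come from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (standard definition)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Observed pairwise agreement for each item.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data (made up): 4 text spans, 3 coders, 2 categories
    # ("answers the question" vs. "does not answer the question").
    toy_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
    print(round(fleiss_kappa(toy_counts), 3))
    # Toy confusion counts (made up) for predicted questionnaire answers.
    print(round(f1_score(tp=7, fp=3, fn=3), 3))
```

Kappa corrects the raw agreement between coders for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when some categories are much more frequent than others.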

Conclusions:

Our feasibility study shows that posts from online patient communities contain valuable QoL-related information. Future research should consider how these insights can be used to complement existing QoL instruments.


 Citation

Please cite as:

Schmidt DM, Schubert R, Chen B, Kuk D, Kudesia V, Hinz A, Cimiano P

Extracting Quality of Life Information of Breast Cancer Patients from Healthcare Online Forum Posts: A Feasibility Study

JMIR Cancer. 23/02/2026:76044 (forthcoming/in press)

DOI: 10.2196/76044

URL: https://preprints.jmir.org/preprint/76044


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.