Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: May 27, 2024
Date Accepted: Dec 30, 2024

The final, peer-reviewed published version of this preprint can be found here:

Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic

Wu X, Lam CS, Hui KH, Loong HHf, Zhou R, Ngan Ck, Cheung YT

Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic

J Med Internet Res 2025;27:e60948

DOI: 10.2196/60948

PMID: 39928933

PMCID: 11851037

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Perception of Online Communities towards the Use of Cancer Immunotherapy: A Data Mining Study of 3.6 Million Web-based Posts from Social Media Platforms Using BERTopic

  • Xingyue Wu
  • Chun Sing Lam
  • Ka Ho Hui
  • Herbert Ho-fung Loong
  • Rui Zhou
  • Chun-kit Ngan
  • Yin Ting Cheung

ABSTRACT

Immunotherapy has become a game changer in cancer treatment, yet few studies have investigated perceptions of its use by analyzing social media data. This study aimed to use a topic modeling technique, BERTopic, to explore the perceptions of online cancer communities regarding immunotherapy. A total of 4.9 million posts were extracted and preprocessed. BERTopic modeling was performed to identify topics from the posts, which were then broadly grouped into distinct themes. After data cleaning, 3.6 million posts remained for modeling. The highest overall topic quality achieved by BERTopic was 70.47% (topic diversity: 87.76%; topic coherence: 80.21%). BERTopic generated 14 topics related to perceptions of immunotherapy, which were categorized into six themes. The themes primarily covered (1) hopeful prospects offered by immunotherapy, (2) perceived effectiveness of immunotherapy, (3) complementary therapies or self-treatments, (4) financial and mental impact of undergoing immunotherapy, (5) impact on lifestyle and time schedules, and (6) side effects due to treatment. This study provides an overview of the multifaceted considerations relevant to the application of immunotherapy as a therapeutic intervention. Furthermore, it demonstrates the effectiveness of BERTopic in analyzing large amounts of data to identify perceptions expressed on social media and in online communities.
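The abstract reports a topic diversity score but does not define how it was computed. As a rough illustration only, the sketch below uses one commonly used definition: the proportion of unique words among the top-k words of all topics. This definition, the function name `topic_diversity`, the `top_k` parameter, and the toy topics are assumptions made for illustration and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Return the fraction of unique words among the top_k words of every topic.

    Assumed definition (not confirmed by the abstract): a value of 1.0 means
    no word is shared between topics; lower values mean more overlap.
    """
    # Collect the first top_k words of each topic into one flat list.
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("at least one topic with at least one word is required")
    return len(set(top_words)) / len(top_words)


# Hypothetical toy topics, not taken from the study's results.
example_topics = [
    ["immunotherapy", "hope", "trial"],
    ["cost", "insurance", "hope"],
]

# "hope" appears in both topics, so 5 of the 6 top words are unique.
diversity = topic_diversity(example_topics, top_k=3)
print(f"Topic diversity: {diversity:.2%}")
```

In practice, the word lists would come from the fitted topic model rather than being written by hand, and the choice of `top_k` affects the score.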



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.