Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 20, 2024
Date Accepted: Oct 14, 2024

The final, peer-reviewed published version of this preprint can be found here:

Steele B, Fairie P, Kemp K, D'Souza AG, Wilms M, Santana MJ

Identifying Patient-Reported Care Experiences in Free-Text Survey Comments: Topic Modeling Study

JMIR Med Inform 2025;13:e63466

DOI: 10.2196/63466

PMID: 39993226

PMCID: 11875393

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Patient-reported inpatient experiences and natural language processing: Unsupervised topic modeling to identify care experiences in free-text comments

  • Brian Steele
  • Paul Fairie
  • Kyle Kemp
  • Adam G. D'Souza
  • Matthias Wilms
  • Maria Jose Santana

ABSTRACT

Background:

Patient-reported experience surveys allow administrators, clinicians, and researchers to quantify and improve healthcare by receiving feedback directly from patients. Existing research has focused primarily on quantitative analysis of survey items, but these surveys may also collect optional free-text comments. These comments can provide insights for health systems but may go unanalyzed because of limited resources and the complexity of traditional textual analysis. However, advances in machine learning-based natural language processing (NLP) provide opportunities to learn from this traditionally underutilized data source.

Objective:

To apply natural language processing to model topics found in free-text comments of patient-reported experience surveys.

Methods:

CAHPS-derived patient experience surveys were collected and linked to administrative inpatient records by the provincial health services organization responsible for inpatient care. Unsupervised topic modeling with automated labeling was performed with BERTopic in Python. Sentiment analysis was performed using {sentimentr} in R to assist in topic description.
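As an illustration of the automated labeling step, the sketch below implements a simplified version of the class-based TF-IDF weighting that the BERTopic documentation describes as its default way of choosing representative terms for each topic. It is a minimal standard-library sketch, not the authors' pipeline: it assumes comments have already been assigned to clusters (BERTopic normally does this by embedding and clustering the documents first), and the function name `c_tf_idf_labels`, the sample comments, and the cluster assignments are all hypothetical.

```python
import math
import re
from collections import Counter


def c_tf_idf_labels(comments, cluster_ids, top_n=3):
    """Return the top_n highest-weighted terms for each cluster.

    Simplified class-based TF-IDF: each cluster is treated as one pooled
    document, and a term's weight is
        tf(term, cluster) * log(1 + avg_words_per_cluster / freq(term overall)).
    """
    # Pool every comment in a cluster into a single term-count table.
    per_cluster = {}
    for text, cid in zip(comments, cluster_ids):
        tokens = re.findall(r"[a-z']+", text.lower())
        per_cluster.setdefault(cid, Counter()).update(tokens)

    # Term frequency across all clusters, and average word count per cluster.
    total = Counter()
    for counts in per_cluster.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(per_cluster)

    labels = {}
    for cid, counts in per_cluster.items():
        n_words = sum(counts.values())
        weights = {
            term: (freq / n_words) * math.log(1 + avg_words / total[term])
            for term, freq in counts.items()
        }
        # Highest weight first; ties are broken alphabetically for stability.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        labels[cid] = ranked[:top_n]
    return labels


# Hypothetical comments and cluster assignments, invented for illustration.
comments = [
    "The nurses were kind and caring",
    "Kind nurses and caring staff",
    "The food was cold and the food was late",
    "Cold food and late meals",
]
cluster_ids = [0, 0, 1, 1]

labels = c_tf_idf_labels(comments, cluster_ids)
for cid, terms in sorted(labels.items()):
    print(cid, ", ".join(terms))
```

Terms that are frequent within one cluster but rare across the others receive the highest weights, which is why common words such as "the" and "and" tend to fall out of the labels without an explicit stop-word list.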

Results:

Between April 2016 and February 2020, 43.4% (n = 43,522) of adult patients and 46.9% (n = 3,501) of pediatric caregivers completed patient experience surveys that included free-text responses. Topic models identified 86 topics among adult survey responses and 35 topics among pediatric responses, including elements of care not currently surveyed by existing questionnaires. Frequent topics were generally positive.

Conclusions:

We found that with limited tuning, BERTopic identified care experience topics with interpretable automated labeling. Results are discussed in the context of person-centered care, patient safety, and healthcare quality improvement. Further, we note the opportunity for the identification of temporal and site-specific trends as a method to identify patient care and safety concerns. As the use of patient experience measurement increases in healthcare, we discuss how machine learning can be leveraged to provide additional insight on patient experiences.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.