JMIR Preprints #92325: Beyond Single Topics: Quantifying Information Loss by Comparing GPT-Based Aspect Sentiment Analysis With LDA in Hospital Reviews

Beyond Single Topics: Quantifying Information Loss by Comparing GPT-Based Aspect Sentiment Analysis With LDA in Hospital Reviews

Jung-Tang Hsueh; Sheng-Hsun Hsu; Shwu-Fen Chiu

ABSTRACT

Background:

Healthcare service quality is inherently multidimensional, yet document-level text analysis methods such as Latent Dirichlet Allocation (LDA) force patient reviews into single dominant topics. This simplification may systematically discard evaluative information when patients discuss multiple service dimensions with varying sentiments within the same review.

Objective:

This study compared document-level topic modeling (LDA) with GPT-based aspect-level sentiment analysis (ABSA) to address three research questions: (1) How much information is lost when collapsing multi-aspect reviews to single topics? (2) How prevalent are mixed-sentiment reviews, and what quality tensions do they reveal—both cross-aspect trade-offs and within-aspect ambivalence? (3) Do positive and negative reviews exhibit different structural patterns in aspect co-occurrence?

Methods:

We analyzed Google Reviews from 2024 for 24 medical centers in Taiwan. Both LDA (K=7 topics) and GPT-based ABSA were applied to the same 5,467 reviews, ensuring a fair comparison on identical data. The ABSA design used structured prompts to extract aspects from seven predefined quality dimensions. Validation against human annotation yielded Cohen κ=.82. Mixed-sentiment reviews were defined as those containing both positive and negative aspect evaluations, and cross-polarity couplings were analyzed to identify recurring trade-off patterns. Rating-stratified network analysis used Jaccard similarity to compare aspect co-occurrence patterns between positive and negative reviews.
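The last two steps of the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say whether Jaccard similarity was computed over edge sets or weighted edges, so the sketch assumes unweighted sets of co-occurring aspect pairs. The function names, the toy reviews, and every aspect label other than "Professional Quality" are hypothetical.

```python
from itertools import combinations

def is_mixed(review):
    """True if a review carries at least one positive and one negative aspect label."""
    polarities = set(review.values())
    return "positive" in polarities and "negative" in polarities

def cooccurrence_edges(reviews):
    """Unordered aspect pairs that appear together in at least one review."""
    edges = set()
    for review in reviews:
        # sorted() makes (a, b) and (b, a) map to the same tuple
        for a, b in combinations(sorted(review), 2):
            edges.add((a, b))
    return edges

def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Toy data (hypothetical): each review maps aspect -> polarity,
# and the two groups stand in for high-rated and low-rated reviews.
positive_reviews = [
    {"Professional Quality": "positive", "Staff Attitude": "positive"},
    {"Professional Quality": "positive", "Environment": "positive"},
]
negative_reviews = [
    {"Professional Quality": "negative", "Staff Attitude": "negative"},
    {"Professional Quality": "positive", "Waiting Time": "negative"},
]

mixed_flags = [is_mixed(r) for r in positive_reviews + negative_reviews]
similarity = jaccard(cooccurrence_edges(positive_reviews),
                     cooccurrence_edges(negative_reviews))
```

In this toy data only the last review is flagged as mixed, and the two groups share one of three distinct aspect pairs. A lower similarity between the rating strata would indicate that positive and negative reviews bundle quality dimensions differently.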

Results:

Reviews discussed an average of 2.05 distinct aspects (SD=0.97), producing 51.2% information loss under LDA's single-topic assignment. Among multi-aspect reviews, 11.0% exhibited cross-aspect mixed sentiment, with Technical–Functional Divergence—praising Professional Quality while criticizing functional dimensions—appearing in 49.9% of these mixed-sentiment cases. Network analysis revealed differential bundling: operational dimensions co-occurred more strongly in negative reviews, whereas clinical dimensions co-occurred more strongly in positive reviews.
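The abstract does not define how information loss was computed. One definition that reproduces the reported figure is the share of aspect mentions dropped when each review keeps only a single label, which reduces to 1 − 1/(mean aspects per review) when every review mentions at least one aspect. The Python sketch below works through that arithmetic under this assumption; the function names and toy counts are hypothetical.

```python
def single_topic_information_loss(aspect_counts):
    """Share of aspect mentions dropped if each review keeps only one.

    aspect_counts: number of distinct aspects found in each review.
    """
    total = sum(aspect_counts)
    kept = sum(1 for n in aspect_counts if n > 0)  # one retained label per review
    return (total - kept) / total

def loss_from_mean(mean_aspects):
    """Same quantity from the mean alone, assuming every review has >= 1 aspect."""
    return 1 - 1 / mean_aspects

# Toy counts (hypothetical): 4 reviews, 8 aspect mentions, mean 2.0
toy_loss = single_topic_information_loss([1, 2, 3, 2])

# Using the mean reported in the abstract (2.05 aspects per review)
reported_loss = round(loss_from_mean(2.05), 3)
print(reported_loss)  # 0.512
```

Under this reading, the 51.2% figure follows directly from the mean of 2.05 aspects per review. A per-review average of 1 − 1/k would generally give a different value.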

Conclusions:

Document-level topic modeling discards more than half of the evaluative information patients provide. Our findings reveal that patients cognitively decouple clinical competence from service delivery—Technical–Functional Divergence appeared in half of mixed-sentiment cases—and that positive and negative reviews organize quality dimensions differently. We recommend a complementary approach: topic modeling for exploratory discovery and ABSA for diagnostic assessment. For healthcare quality improvement, hospitals should separate clinical signals from operational signals in feedback dashboards.

Citation

Please cite as:

Hsueh JT, Hsu SH, Chiu SF

Beyond Single Topics: Quantifying Information Loss by Comparing GPT-Based Aspect Sentiment Analysis With LDA in Hospital Reviews

JMIR Preprints. 08/03/2026:92325

DOI: 10.2196/preprints.92325

URL: https://preprints.jmir.org/preprint/92325

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Currently submitted to: Journal of Medical Internet Research

Date Submitted: Mar 8, 2026

Open Peer Review Period: Mar 9, 2026 - May 4, 2026

(currently open for review)
