JMIR Preprints #103305: Large Language Model-Assisted Measurement of ADRD Caregiver Burden on Reddit

Large Language Model-Assisted Measurement of ADRD Caregiver Burden on Reddit

Yan Zhang;
Mingfei Zhang

ABSTRACT

Background:

Informal caregivers of older adults with Alzheimer’s disease and related dementias (ADRD) often face high burden and chronic stress. Reddit provides naturally occurring caregiver narratives for studying caregiving burden, but prior studies have often relied on small manually coded samples or aggregate topic models, which may not readily connect caregiving context, burden, and stressors at scale.

Objective:

This study aimed to apply a large language model (LLM)-assisted measurement workflow to characterize ADRD caregiver burden, caregiving contexts, stressors, temporal patterns, and high-burden subgroups in Reddit posts.

Methods:

We processed monthly Reddit submission archives from January 2020 through June 2025. Posts were selected from 5 ADRD-specific subreddits and 4 broader caregiving or aging subreddits, yielding a balanced 13,200-post corpus. A locally hosted open-weight LLM inferred structured post-level variables including caregiver sociodemographic characteristics, caregiver role, ADRD stage, care setting, support need, dominant emotion, sentiment, burden, and primary and secondary stressors. The workflow included schema validation, uncertainty ratings, field-level correction prompts, source matching, a 150-post double-coded human validation benchmark, and pairwise subgroup analyses.
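The schema validation step mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names, the allowed category values, and the 0 to 10 burden range are hypothetical and are not taken from the authors' codebook. The idea is only that each LLM-produced record is checked field by field, and any failing field names could then be passed to a field-level correction prompt.

```python
from typing import Any

# Hypothetical schema: field names and allowed values are illustrative
# assumptions, not the authors' actual codebook.
SCHEMA = {
    "caregiver_role": {"adult_child", "spouse", "grandchild", "other", "unknown"},
    "adrd_stage": {"early", "moderate", "late", "unknown"},
    "care_setting": {"home", "facility", "transition", "unknown"},
}
BURDEN_RANGE = (0, 10)  # assumed scale; the abstract does not state the range


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return the names of fields that fail validation (empty list means valid)."""
    bad = []
    for field, allowed in SCHEMA.items():
        value = record.get(field)
        # A categorical field must be a string drawn from the allowed set.
        if not isinstance(value, str) or value not in allowed:
            bad.append(field)
    burden = record.get("burden")
    low, high = BURDEN_RANGE
    # Reject missing, non-numeric, boolean, or out-of-range burden scores.
    if (
        isinstance(burden, bool)
        or not isinstance(burden, (int, float))
        or not low <= burden <= high
    ):
        bad.append("burden")
    return bad
```

Returning the list of failing fields, rather than a single pass/fail flag, is what would let a correction step re-prompt for only the problematic variables.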

Results:

The source-matched analytic sample included 13,179 LLM-coded posts. Human agreement was high for caregiver role, ADRD stage, care setting, dominant emotion, support need type, burden group, and stressor categories (κ=0.77–1.00), but poor for the initial in-scope screen (κ=0.00). Adult children were the largest inferred caregiver group (8266/13,179, 62.7%), followed by grandchildren (2012/13,179, 15.3%). Moderate ADRD was the modal inferred stage (6554/13,179, 49.7%), and home was the modal care setting (7145/13,179, 54.2%). Mean burden increased from early-stage ADRD (5.38) to moderate-stage ADRD (6.96) and late-stage ADRD (7.77). Behavioral strain was the dominant primary stressor, while family conflict, financial strain, emotional strain, and social isolation were leading secondary stressors. Pairwise subgroup analysis showed high burden in late-stage ADRD with emotional support needs (mean 8.42), care transitions with emotional support needs (mean 8.45), and late-stage care transitions (mean 8.40).
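For readers less familiar with Cohen's κ, the following sketch computes it for two coders and shows one common way a value of 0.00 can arise despite high raw agreement: when one coder assigns the same label to every item, chance-expected agreement equals observed agreement. The labels below are hypothetical and are not the study's validation data; the abstract does not state why κ was 0.00 for the in-scope screen.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    p_o = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both coders use one identical label
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 150-item screen: coder A marks every post in scope,
# coder B marks 148 in scope and 2 out of scope.
coder_a = ["in"] * 150
coder_b = ["in"] * 148 + ["out"] * 2
raw_agreement = sum(x == y for x, y in zip(coder_a, coder_b)) / 150
kappa = cohens_kappa(coder_a, coder_b)
```

In this constructed case raw agreement is 148/150, yet κ is 0 because the expected agreement under independence is also 148/150. This is why a near-constant screening variable is often reported with raw agreement alongside κ.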

Conclusions:

LLM-assisted measurement can help convert unstructured caregiver narratives into post-level, theory-guided quantitative measures when paired with validation and cautious interpretation. This workflow extends prior online caregiver research by linking caregiving context, burden, and stressor pathways at scale. The findings highlight care transitions, emotional support needs, financial navigation, and family role negotiation as potential targets for caregiver-support interventions.

Clinical Trial:

Not applicable

Citation

Please cite as:

Zhang Y, Zhang M

Large Language Model-Assisted Measurement of ADRD Caregiver Burden on Reddit

JMIR Preprints. 01/06/2026:103305

DOI: 10.2196/preprints.103305

URL: https://preprints.jmir.org/preprint/103305

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Currently submitted to: JMIR Aging

Date Submitted: Jun 1, 2026

Open Peer Review Period: Jun 2, 2026 - Jul 28, 2026

NOTE: This is an unreviewed Preprint
