Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jul 1, 2024
Open Peer Review Period: Jul 8, 2024 - Sep 2, 2024
Date Accepted: Feb 17, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Systematic Identification of Caregivers of Patients Living with Dementia in the Electronic Health Record: Known Contacts and Natural Language Processing
ABSTRACT
Background:
Systematically identifying caregivers in the electronic health record (EHR) is a critical step for delivering patient-centered care, enhancing care coordination, and advancing research and population health efforts in caregiving. Although EHRs are effective for identifying patients through standardized data fields such as demographics, lab results, medications, and diagnoses, identifying caregivers through the EHR is challenging in the absence of dedicated caregiver fields.
Objective:
Recognizing the complexity of identifying caregiving networks of people living with dementia (PLWD), this study aimed to systematically capture caregiver information by combining EHR structured fields and unstructured notes and free text.
Methods:
Among a cohort of PLWD aged ≥60 years from Kaiser Permanente Colorado (KPCO), caregiver names were identified by combining structured patient contact fields (i.e., known contacts) with name matching and natural language processing (NLP) applied to unstructured notes and patient portal messages containing caregiver terms.
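As a rough illustration of the name-matching step, the sketch below flags known contact names that occur within a few tokens of a caregiver term in free text. It is not the study's implementation: the term list `CAREGIVER_TERMS`, the function `contacts_near_caregiver_terms`, the token window of 5, and the simple regular-expression tokenizer are all assumptions made for this example.

```python
import re

# Hypothetical caregiver terms; the study's actual term list is not given in the abstract.
CAREGIVER_TERMS = {"caregiver", "daughter", "son", "wife", "husband", "spouse"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def contacts_near_caregiver_terms(text, known_contacts, window=5):
    """Return the known contact names found within `window` tokens of a caregiver term."""
    tokens = tokenize(text)
    term_positions = [i for i, tok in enumerate(tokens) if tok in CAREGIVER_TERMS]
    matched = set()
    for name in known_contacts:
        name_tokens = tokenize(name)
        n = len(name_tokens)
        if n == 0:
            continue  # skip empty or non-alphabetic names
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] != name_tokens:
                continue
            end = start + n - 1
            # Keep the name if any caregiver term falls inside the window around it.
            if any(start - window <= pos <= end + window for pos in term_positions):
                matched.add(name)
                break
    return matched

note = "Spoke with daughter Jane Doe about medication changes."
print(contacts_near_caregiver_terms(note, ["Jane Doe", "John Smith"]))
```

A token window is only one possible definition of "near"; a character window or sentence boundary would work equally well in this sketch. Exact full-name matching like this will also miss nicknames, misspellings, and first-name-only mentions.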
Results:
Among the cohort of N=789 PLWD, 95% had at least one caregiver name listed in structured fields (mean=2.1). Over 95% of the cohort had caregiver terms mentioned near a known contact name in unstructured encounter notes, and 35% had a full name match in unstructured patient portal messages. In unstructured EHR text containing caregiver terms, the NLP model identified 7,556 “new” names among 99% of the cohort with high accuracy and reliability (F1=.85, precision=.89, recall=.82). In addition, 87% of the cohort had a new name identified ≥2 times near a caregiver term in their notes and portal messages.
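The reported F1 can be checked against the reported precision and recall, since F1 is by definition their harmonic mean. The short sketch below does that arithmetic; the helper name `f1_score` is chosen for this example, and rounding to two decimals is an assumption about how the figures were reported.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract: precision=.89, recall=.82
print(round(f1_score(0.89, 0.82), 2))  # prints 0.85
```

The result agrees with the stated F1=.85, so the three reported metrics are internally consistent.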
Conclusions:
Analysis revealed significant patterns in caregiver-related information distributed across structured and unstructured EHR fields, emphasizing the importance of integrating both data sources for a comprehensive understanding of caregiving networks. A framework was developed to systematically identify potential caregivers across caregiving networks using structured and unstructured EHR data. This approach has the potential to improve health services for PLWD and their caregivers.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.