Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 29, 2024
Open Peer Review Period: Apr 1, 2024 - May 27, 2024
Date Accepted: Aug 17, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Automated System to Capture Patient Symptoms from Multimodal Texts: Natural Language Processing Approach
ABSTRACT
Background:
Natural language processing (NLP) techniques can be used to process large amounts of electronic health record (EHR) texts containing various types of patient information, such as quality of life (QoL), effectiveness of treatments, and adverse drug event (ADE) signals. Because different aspects of a patient's status are recorded in different types of documents, we propose an NLP system capable of processing six types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope (RI) reports, nursing records, and pharmacist progress notes.
Objective:
This study investigated the system's performance in detecting ADEs by combining the results from multimodal texts. The main objective was to determine the extent to which the system outputs for certain ADEs are consistent with outcomes obtained by manual methods in existing reports.
Methods:
Data from 2,289 patients with breast cancer were used, including medication data, physician progress notes, discharge summaries, radiology reports, RI reports, nursing records, and pharmacist progress notes. We used a language processing system that performs three linguistic processes: named-entity recognition (NER), factuality determination, and medical term normalization. Among all patients with breast cancer, 103 and 112 patients with peripheral neuropathy had received paclitaxel (PTX) and docetaxel (DTX), respectively.
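The three linguistic processes can be illustrated with a minimal sketch. It is not the authors' system: the lexicon, the negation cues, and the names `LEXICON`, `NEGATION_CUES`, `Symptom`, and `extract_symptoms` are hypothetical, and simple dictionary lookup stands in for whatever NER, factuality, and normalization models the system actually uses.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon: surface form -> normalized medical term.
# This is an illustrative assumption, not the authors' dictionary.
LEXICON = {
    "numbness": "peripheral neuropathy",
    "tingling": "peripheral neuropathy",
    "peripheral neuropathy": "peripheral neuropathy",
    "nausea": "nausea",
}

# Hypothetical negation cues for a crude factuality check.
NEGATION_CUES = ("no ", "denies ", "without ")


@dataclass
class Symptom:
    surface: str      # text span found in the note (NER step)
    normalized: str   # canonical term (normalization step)
    factual: bool     # False if the mention appears negated (factuality step)


def extract_symptoms(sentence: str) -> List[Symptom]:
    """Run toy NER, factuality determination, and normalization on one sentence."""
    text = sentence.lower()
    found = []
    for surface, normalized in LEXICON.items():
        start = text.find(surface)
        if start == -1:
            continue
        # Look at a short window before the mention for a negation cue.
        window = text[max(0, start - 20):start]
        factual = not any(cue in window for cue in NEGATION_CUES)
        found.append(Symptom(surface, normalized, factual))
    return found
```

For example, `extract_symptoms("Denies nausea today.")` yields one mention of nausea flagged as non-factual, so it would not be counted as an ADE signal.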
Results:
The incidence of PTX-induced peripheral neuropathy detected by the system was 60.7% after 30 days. Compared with the previously reported incidence of approximately 75% after 30 days, this corresponds to a relatively favorable detection sensitivity of approximately 80%. The Pearson correlation coefficient between the manual and system results was 0.870. The estimated median duration was 92 days, whereas the previously reported median duration of peripheral neuropathy with paclitaxel was 727 days. The number of events detected per document type was highest in the physician progress notes, followed by the pharmacist progress notes and nursing records.
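The arithmetic behind these figures can be sketched as follows. This assumes that the approximately 80% sensitivity is the ratio of the system-detected incidence to the previously reported incidence; the abstract does not state the calculation explicitly. The function names `relative_sensitivity` and `pearson` are hypothetical, and `pearson` is a textbook implementation of the standard coefficient, not the authors' code.

```python
import math


def relative_sensitivity(detected_incidence: float, reported_incidence: float) -> float:
    """Ratio of system-detected incidence to a reference incidence (assumed definition)."""
    return detected_incidence / reported_incidence


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Incidence values taken from the Results: 60.7% detected vs. about 75% reported.
ratio = relative_sensitivity(0.607, 0.75)
print(f"relative sensitivity: {ratio:.3f}")
```

Under this assumption the ratio is 0.607 / 0.75, or about 0.81, which is consistent with the "approximately 80%" stated above. The `pearson` function could be applied to paired manual and system incidence curves sampled at the same time points.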
Conclusions:
Considering that the treatment of peripheral neuropathy is inherently costly because the patient's condition must be constantly monitored, our system has a significant advantage in that it can immediately estimate the treatment duration. Although the onset time estimates were relatively accurate, the duration estimates may be affected by the length of the data follow-up period. The results suggest that our method, by using various types of documents, can detect more ADEs than a single document type would reveal.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.