Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Apr 3, 2021
Open Peer Review Period: Mar 30, 2021 - Apr 11, 2021
Date Accepted: May 19, 2021

The final, peer-reviewed published version of this preprint can be found here:

Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus

Matsuda S, Ohtomo T, Tomizawa S, Miyano Y, Mogi M, Kuriki H, Nakayama T, Watanabe S

JMIR Public Health Surveill 2021;7(6):e29238

DOI: 10.2196/29238

PMID: 34255719

PMCID: 8278300

Combined Analysis of Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Analysis of Systemic Lupus Erythematosus

  • Shinichi Matsuda
  • Takumi Ohtomo
  • Shiho Tomizawa
  • Yuki Miyano
  • Miwako Mogi
  • Hiroshi Kuriki
  • Terumi Nakayama
  • Shinichi Watanabe

ABSTRACT

Background:

Gaining insights directly from patients has become an important topic in pharmacovigilance because it can reveal findings that cannot be obtained from healthcare databases.

Objective:

We aimed to demonstrate a use case for incorporating patient-generated data in pharmacovigilance: understanding the epidemiology and burden of illness in Japanese patients with systemic lupus erythematosus (SLE).

Methods:

Using two independent data sources, we examined SLE, an autoimmune disease that substantially impairs quality of life. To understand the epidemiology of SLE, we analyzed a Japanese health insurance claims database. To understand the burden of SLE, we analyzed textual data collected from Japanese disease blogs (tōbyōki) written by patients with SLE. Natural language processing was applied to these texts to identify frequent patient-level complaints. Term frequency-inverse document frequency (TF-IDF) was used to explore patients' burden during treatment. We also explored health-related quality of life based on patients' descriptions.
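
To illustrate the term-weighting step named above, here is a minimal sketch of term frequency-inverse document frequency. It is not the authors' implementation: the abstract does not state which TF-IDF variant was used, so the textbook form tf(t, d) × log(N / df(t)) is an assumption, as are the function name `tf_idf` and the toy English posts. Japanese text has no whitespace word boundaries, so real tōbyōki posts would need a tokenizer first; the sketch assumes tokenization is already done.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf(t, d) * log(N / df(t)). The study's exact
    variant is not given in the abstract.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, pre-tokenized toy posts (not data from the study).
before_treatment = ["fever", "rash", "hospital", "fatigue"]
after_treatment = ["pain", "joint", "pain", "hospital", "steroid"]
other_post = ["hospital", "fatigue", "rash"]

scores = tf_idf([before_treatment, after_treatment, other_post])

# Highest-weighted term in the post written after starting treatment.
top_after = max(scores[1], key=scores[1].get)
print(top_after)
```

In this toy corpus, "hospital" appears in every post and therefore receives a weight of zero, while "pain" is frequent in one post and absent from the others, so it ranks highest there. Comparing such rankings between posts written before and after treatment starts is one way the shift in term importance described in the abstract could be examined.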

Results:

We analyzed 4,694 patients with SLE in the healthcare database and 635 in the tōbyōki blogs. Analysis of the healthcare database showed the prevalence of SLE and the treatment details. Textual data analysis showed that pain-related words became more important after starting treatment, suggesting the burden that patients experienced. We also found a continuous increase in patients' references to mobility and self-care over time, which indicated increased attention to physical disability due to disease progression.

Conclusions:

A classical medical database represents only a part of a patient's entire treatment experience, and analysis using such a database alone cannot infer patient-level symptoms or patient concerns about treatments. This study showed that analysis of tōbyōki blogs could add information on patient-level details for advancing patient-centric pharmacovigilance.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.