Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jul 6, 2020
Date Accepted: Nov 5, 2020

The final, peer-reviewed published version of this preprint can be found here:

Lifelog Data-Based Prediction Model of Digital Health Care App Customer Churn: Retrospective Observational Study

Kwon H, Kim HH, An JI, Lee JH, Park YR

J Med Internet Res 2021;23(1):e22184

DOI: 10.2196/22184

PMID: 33404511

PMCID: 7817354

Development of a Prediction Model for Customer Churn with Life-log Data for Digital Healthcare Applications: A Retrospective Observational Study

  • Hongwook Kwon
  • Ho Heon Kim
  • Jae Il An
  • Jae-Ho Lee
  • Yu Rang Park

ABSTRACT

Background:

In digital healthcare, predicting user churn matters not only for a company's revenue but also for improving users' health. Many previous studies have examined churn prediction, but most applied time-invariant model structures and relied primarily on structured data. As unstructured data have become increasingly available, churn prediction now needs to process daily time-series log data.

Objective:

This study aimed to apply a recurrent neural network architecture that captures time-series patterns in life-log data and text message data to predict churn among digital healthcare users.

Methods:

This study was based on a digital healthcare application that provides food, exercise, and weight logging, as well as interactive messaging with human coaches. Among users in Korea who enrolled between January 1, 2017 and January 1, 2019, we defined churn users according to the following criteria: 1) users who received a refund before the paid program ended; and 2) users who received a refund after 7 days of the trial period. We used a long short-term memory (LSTM) network with a masking layer to handle sequence data of different lengths. We also carried out topic modeling to vectorize text messages. To interpret the contribution of each variable to the model's predictions, we used integrated gradients, an attribution method.
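The attribution step can be illustrated with a minimal sketch of integrated gradients. This is not the authors' implementation: it assumes a toy logistic model in place of the LSTM, and the weights `w`, bias `b`, input `x`, and all-zero `baseline` are hypothetical values chosen only for illustration. The method averages the model's gradient along a straight path from the baseline to the input and scales it by the input difference.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    # Toy logistic model standing in for the churn classifier (assumption).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def gradient(x, w, b):
    # Analytic gradient of the logistic output with respect to each input.
    p = predict(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    # Midpoint Riemann sum of gradients along the path baseline -> x.
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        grad = gradient(point, w, b)
        totals = [t + g for t, g in zip(totals, grad)]
    # Scale the averaged gradient by the input difference per feature.
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, totals)]

# Hypothetical weights and inputs, for illustration only.
w = [0.8, -1.2, 0.3]
b = -0.1
x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)
gap = predict(x, w, b) - predict(baseline, w, b)
print(attributions, sum(attributions), gap)
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline.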

Results:

A total of 1,868 eligible users were included in this study. The final classification performance of churn prediction was 0.89 (F1-score), and the score decreased by 0.12 when the data of the final week were excluded (0.77, F1-score). In addition, when text data were included, prediction performance increased by approximately 0.085 (F1-score) on average at every time point. As for the contribution of each variable, the number of steps per day had the largest contribution (0.1085, contribution to model output), and among the topic variables, the topic about bad habits (e.g., drinking, overeating, and late-night eating) showed the largest contribution (0.0875).
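For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall for the positive (churn) class. The sketch below computes it in plain Python; the function name `f1_score` and the labels are made up for illustration and are not data from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    # Count true positives, false positives, and false negatives.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # Convention: F1 is 0 when no positives are recovered.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = churned, 0 = retained.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(score)  # 3 TP, 1 FP, 1 FN -> precision = recall = 0.75
```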

Conclusions:

A model with a recurrent neural network architecture that uses user log data and message data demonstrated high performance in churn classification. In addition, the contribution analysis of variables is expected to help identify signs of user churn in advance and improve the compliance rate in digital healthcare.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.