Accepted for/Published in: JMIR Cancer

Date Submitted: Feb 11, 2020
Date Accepted: Jun 18, 2020
Date Submitted to PubMed: Sep 24, 2020

The final, peer-reviewed published version of this preprint can be found here:

Incorporating Breast Cancer Recurrence Events Into Population-Based Cancer Registries Using Medical Claims: Cohort Study

A'mar T, Beatty JD, Fedorenko C, Markowitz D, Corey T, Lange J, Schwartz S, Huang B, Chubak J, Etzioni R

JMIR Cancer 2020;6(2):e18143

DOI: 10.2196/18143

PMID: 32804084

PMCID: 7459434

Incorporating Breast Cancer Recurrence Events into Population-based Cancer Registries using Medical Claims

  • Teresa A'mar
  • J. David Beatty
  • Catherine Fedorenko
  • Daniel Markowitz
  • Thomas Corey
  • Jane Lange
  • Stephen Schwartz
  • Bin Huang
  • Jessica Chubak
  • Ruth Etzioni

ABSTRACT

Background:

There is a need for automated, scalable approaches to incorporate information on cancer recurrence events into population-based cancer registries.

Objective:

We aimed to develop a new statistical learning algorithm to predict the occurrence and timing of second breast cancer events (SBCEs) using cancer registry information linked with medical claims, among women with localized breast cancer diagnosed in the Puget Sound SEER cancer registry (the Cancer Surveillance System, CSS) and treated at Kaiser Permanente Washington (KPWA), formerly Group Health. Because statistical learning algorithms that use only a single tree often exhibit suboptimal predictive performance, we sought to improve performance and increase the efficiency of claims-based recurrence identification for population-based cancer registries.

Methods:

We used labeled data from 3,092 stage I and II breast cancer cases (394 recurrences), diagnosed between 1993 and 2006 inclusive, who were patients at Kaiser Permanente Washington and cases in the Puget Sound Cancer Surveillance System (CSS). Our goal was to classify each month after primary treatment as pre-SBCE or post-SBCE. The prediction feature set for a given month consisted of registry variables on disease and patient characteristics related to the primary breast cancer event, together with features based on monthly counts of diagnosis and procedure codes for the current, prior, and future months. A month was classified as post-SBCE if its predicted probability exceeded a probability threshold (PT); the predicted time of the SBCE was taken to be the month with the largest increase in predicted probability between adjacent months.
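The month-level classification and timing rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 0-based month index, the choice to report the later month of the adjacent pair with the largest increase, and the person-level rule (a person has a predicted SBCE if any month exceeds the PT) are all assumptions that the abstract does not specify.

```python
from typing import List, Optional, Sequence


def classify_months(probs: Sequence[float], pt: float = 0.5) -> List[bool]:
    """Label each month as post-SBCE (True) if its predicted probability exceeds pt."""
    return [p > pt for p in probs]


def predicted_sbce_month(probs: Sequence[float], pt: float = 0.5) -> Optional[int]:
    """Return the 0-based index of the predicted SBCE month, or None.

    Assumption: a person is predicted to have an SBCE if any month exceeds pt.
    Assumption: the reported month is the later month of the adjacent pair
    with the largest increase in predicted probability.
    """
    if not any(classify_months(probs, pt)):
        return None
    if len(probs) < 2:
        # Only one month is available, so no adjacent-month increase exists.
        return 0
    increases = [probs[i] - probs[i - 1] for i in range(1, len(probs))]
    largest = max(range(len(increases)), key=increases.__getitem__)
    return largest + 1


# Hypothetical monthly probabilities for one person (illustrative values only).
example_probs = [0.02, 0.03, 0.05, 0.70, 0.90, 0.95]
example_labels = classify_months(example_probs)
example_month = predicted_sbce_month(example_probs)
```

With these made-up probabilities, the largest jump is between the third and fourth months, so the sketch reports the fourth month (index 3) as the predicted SBCE time.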

Results:

The Kaplan–Meier net probability of SBCE was 0.25 at 14 years. The month-level ROC curve on test data (20% of the dataset) had an area under the curve of 0.986. The person-level predictions (at a monthly PT of 0.5) had sensitivity=0.89, specificity=0.98, positive predictive value (PPV)=0.85, and negative predictive value (NPV)=0.98. The corresponding median difference between the observed and predicted months of recurrence was 0 months, and the mean difference was 0.04 months.
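For readers less familiar with the person-level metrics reported above, the following sketch shows how sensitivity, specificity, PPV, and NPV follow from a 2x2 table of true and predicted SBCE status. The function name is hypothetical, and the counts in the usage lines are invented for illustration; they are not the study's data.

```python
from typing import Dict


def person_level_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard diagnostic accuracy metrics from confusion-matrix counts.

    tp: people with an SBCE who were predicted to have one
    fp: people without an SBCE who were predicted to have one
    tn: people without an SBCE who were predicted not to have one
    fn: people with an SBCE who were predicted not to have one
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true SBCEs that were detected
        "specificity": tn / (tn + fp),  # share of non-SBCEs correctly cleared
        "ppv": tp / (tp + fp),          # share of predicted SBCEs that were real
        "npv": tn / (tn + fn),          # share of predicted non-SBCEs that were real
    }


# Illustrative counts only (not taken from the paper).
example_metrics = person_level_metrics(tp=8, fp=2, tn=88, fn=2)
```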

Conclusions:

Data mining of medical claims holds promise for streamlining cancer registry operations to feasibly collect information about second breast cancer events.

Clinical Trial:

Not applicable


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.