Accepted for/Published in: JMIR Cancer

Date Submitted: Feb 8, 2022
Date Accepted: Jun 23, 2022

The final, peer-reviewed published version of this preprint can be found here:

Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Computational Analysis of a Web-Based Survey

Dolan EH, Goulding J, Tata LJ, Lang AR

JMIR Cancer 2023;9:e37141

DOI: 10.2196/37141

PMID: 37000495

PMCID: 10131768

Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Survey Study

  • Elizabeth Helen Dolan; 
  • James Goulding; 
  • Laila J Tata; 
  • Alexandra R Lang

ABSTRACT

Background:

Shopping data can be analysed using machine learning techniques to study population health. It is unknown whether such methods can be used to investigate pre-diagnosis purchases linked to self-medication of ovarian cancer symptoms.

Objective:

To gain new domain knowledge from women’s experiences, to better understand how women’s shopping behaviour relates to their pathway to diagnosis of ovarian cancer, and to inform research on computational analysis of shopping data for insights into population health.

Methods:

An online survey about individuals’ shopping patterns prior to an ovarian cancer diagnosis was analysed to identify key knowledge about healthcare purchases. Logistic regression and random forest models were used to examine how purchases of products linked to potential symptoms were associated with presentation to healthcare and the timing of diagnosis.
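The Results report model accuracy from nested cross-validation. As a rough sketch of that evaluation scheme only, the following standard-library Python example tunes a hyperparameter in an inner loop and scores the tuned model on held-out outer folds. It rests on loud assumptions: a one-feature k-nearest-neighbour classifier stands in for the random forest models, the data are synthetic rather than survey responses, and every function name, fold count, and candidate value is hypothetical.

```python
import random

def k_folds(rows, k):
    """Split rows into k nearly equal folds by striding."""
    return [rows[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority vote among the k training rows whose feature is closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    positives = sum(label for _, label in nearest)
    return 1 if 2 * positives > len(nearest) else 0

def accuracy(train, test, k):
    """Share of test rows whose label the k-NN vote reproduces."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def without(folds, i):
    """All rows except those in fold i."""
    return [row for j, fold in enumerate(folds) if j != i for row in fold]

def nested_cv(rows, outer_k=5, inner_k=3, candidates=(1, 3, 5, 7), seed=0):
    """Return (mean in-sample accuracy, mean out-of-sample accuracy)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    outer = k_folds(rows, outer_k)
    in_scores, out_scores = [], []
    for i, test in enumerate(outer):
        train = without(outer, i)
        inner = k_folds(train, inner_k)

        def inner_score(k):
            # Mean validation accuracy over the inner folds for this k.
            scores = [accuracy(without(inner, m), val, k)
                      for m, val in enumerate(inner)]
            return sum(scores) / len(scores)

        # The hyperparameter is chosen without ever seeing the outer test fold.
        best_k = max(candidates, key=inner_score)
        in_scores.append(accuracy(train, train, best_k))
        out_scores.append(accuracy(train, test, best_k))
    return sum(in_scores) / outer_k, sum(out_scores) / outer_k

# Synthetic, hypothetical data: one numeric feature and a noisy binary label.
_rng = random.Random(1)
DATA = []
for _ in range(100):
    feature = _rng.uniform(0, 8)
    label = 1 if feature + _rng.gauss(0, 2) > 4 else 0
    DATA.append((feature, label))

in_acc, out_acc = nested_cv(DATA)
print(f"in-sample {in_acc:.3f}, out-of-sample {out_acc:.3f}")
```

Keeping the hyperparameter search inside each outer training fold is what makes the out-of-sample figure an estimate of performance on unseen respondents rather than a score tuned on the test data.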

Results:

Of 101 women with ovarian cancer who were surveyed, 58% bought non-prescription healthcare products before diagnosis, in some cases for more than a year; these included pain relief and abdominal products. General Practitioner advice was the primary reason for purchases (40%), with 51% occurring due to misdiagnosis. Associations were shown between purchases made due to misdiagnosis and the following variables: health problems for longer than a year prior to diagnosis (OR 7.33; 95% CI 1.58 – 33.97), buying healthcare products for more than 6 months to a year (OR 3.82; 95% CI 1.04 – 13.98) or for more than a year (OR 7.64; 95% CI 1.38 – 42.33), and the number of healthcare product types purchased (OR 1.54; 95% CI 1.13 – 2.11). Purchasing patterns were potentially predictive of misdiagnosis among women in the study: nested cross-validation of random forest classification models achieved an average in-sample accuracy of 89.1% and an average out-of-sample accuracy of 70.1%.
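For readers less familiar with odds ratios, here is a minimal Python sketch of how an odds ratio and a Wald-type 95% confidence interval can be computed from a 2x2 table. It is an illustration under stated assumptions, not the authors' computation: the study's estimates come from logistic regression models, the cell counts below are hypothetical, and the function name is invented for this example. The final check uses only the figures reported above and relies on the fact that a Wald interval is symmetric on the log scale, so the geometric mean of its bounds should be close to the point estimate.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only; NOT taken from the study.
example_or, example_lo, example_hi = odds_ratio_ci(12, 8, 6, 24)
print(f"OR {example_or:.2f}; 95% CI {example_lo:.2f} to {example_hi:.2f}")

# (OR, lower, upper) exactly as reported in the Results paragraph.
REPORTED = [
    (7.33, 1.58, 33.97),
    (3.82, 1.04, 13.98),
    (7.64, 1.38, 42.33),
    (1.54, 1.13, 2.11),
]
for reported_or, lo, hi in REPORTED:
    # Geometric mean of the bounds; should sit near the reported OR.
    print(reported_or, round(math.sqrt(lo * hi), 2))
```

The wide intervals (for example 1.38 to 42.33) reflect the small sample: with few respondents in some cells, the standard error of the log odds ratio is large.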

Conclusions:

The study indicates that a delay to diagnosis of ovarian cancer is significantly associated with buying healthcare products due to a doctor’s misdiagnosis. Women in the survey who self-medicated were seven times more likely to wait longer than a year for an accurate diagnosis if their decision to self-medicate was made due to advice from a doctor, rather than independently. Results indicate that misdiagnosis could potentially be identified via analysis of women’s shopping behaviours, and highlight the need to investigate whether receiving advice from a doctor is disproportionately increasing the time women self-manage symptoms before re-seeking help, leading to prolonged misdiagnosis.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.