Previously submitted to: Interactive Journal of Medical Research (no longer under consideration since Mar 19, 2019)

Date Submitted: Mar 15, 2019
Open Peer Review Period: Mar 17, 2019 - Mar 19, 2019

NOTE: This is an unreviewed Preprint

The performance of artificial neural network, logistic regression, and random forest models to predict breast cancer in patients with type 2 diabetes mellitus

  • Chia-Hung Kao

Background:

Breast cancer incidence may be higher among patients with type 2 diabetes mellitus (T2DM) than in the general population.

Objective:

This study evaluated the performance of three models for predicting breast cancer risk in patients with T2DM.

Methods:

In total, 1,267,867 patients with newly diagnosed T2DM between 2000 and 2012 were identified from Taiwan's National Health Insurance Research Database. Using their data, we created prediction models for detecting an increased risk of subsequent breast cancer development in patients with T2DM. The available potential risk factors for breast cancer were also collected for adjustment in the analyses. The Synthetic Minority Oversampling Technique (SMOTE) was used to augment data points in the minority class. Each data point was randomly allocated to the training or test set at a ratio of approximately 39:1. The performance of the artificial neural network (ANN), logistic regression (LR), and random forest (RF) models was determined using recall, precision, F1 score, and area under the receiver operating characteristic curve (AUC).
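The abstract does not state which software, SMOTE settings, or ordering of oversampling and splitting the authors used, so the following is only a minimal standard-library sketch of the generic techniques named above: SMOTE-style interpolation between minority-class neighbours, a random split at roughly 39:1, and the four evaluation metrics. All function names (`smote`, `split_39_to_1`, `precision_recall_f1`, `auc`) and the neighbour count `k=5` are assumptions made for illustration, and model fitting (ANN, LR, RF) is deliberately left out.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority-class neighbours.
    (k=5 is an illustrative default, not a value from the abstract.)"""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def split_39_to_1(rows, rng=None):
    """Shuffle a copy of rows and hold out about 1/40 of them as the test set."""
    rng = rng or random.Random(0)
    rows = rows[:]
    rng.shuffle(rows)
    n_test = max(1, round(len(rows) / 40))
    return rows[n_test:], rows[:n_test]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    has the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 from `auc` corresponds to a ranking no better than chance, which is the reference value the results are compared against. One design point worth noting: a common precaution is to oversample only the training portion so that synthetic points do not leak into the test set; the abstract does not say which order was used here.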

Results:

The AUCs of all three models were significantly higher than 0.5, the value expected under the null hypothesis (0.959, 0.865, and 0.834 for the RF, ANN, and LR models, respectively). The RF model had the largest AUC among all models; moreover, it had the highest values in all other metrics.

Conclusions:

Although all three models could accurately predict high breast cancer risk in patients with T2DM in Taiwan, the RF model demonstrated the best performance.

ClinicalTrial:

This is not a clinical trial.


Citation

Please cite as:

Kao CH

The performance of artificial neural network, logistic regression, and random forest models to predict breast cancer in patients with type 2 diabetes mellitus

JMIR Preprints. 15/03/2019:14027

DOI: 10.2196/preprints.14027

URL: https://preprints.jmir.org/preprint/14027

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.