Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 7, 2018
Open Peer Review Period: Oct 13, 2018 - Dec 1, 2018
Date Accepted: Jan 26, 2019

The final, peer-reviewed published version of this preprint can be found here:

Zhang Y, Zhou Y, Zhang D, Song W

A Stroke Risk Detection: Improving Hybrid Feature Selection Method

J Med Internet Res 2019;21(4):e12437

DOI: 10.2196/12437

PMID: 30938684

PMCID: 6466481

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

A Stroke Risk Detection: Improving Hybrid Feature Selection Method

  • Yonglai Zhang
  • Yaojian Zhou
  • Dongsong Zhang
  • Wenai Song

Background:

Stroke is one of the most common causes of mortality. Detecting an individual's risk of stroke is critical yet challenging because of the large number of risk factors involved.

Objective:

This study aimed to address the limitation of ineffective feature selection in existing research on stroke risk detection. We have proposed a new feature selection method called weighting- and ranking-based hybrid feature selection (WRHFS) to select important risk factors for detecting ischemic stroke.

Methods:

WRHFS integrates the strengths of various filter algorithms by following the principle of a wrapper approach. We employed a variety of filter-based feature selection models as the candidate set, including standard deviation, Pearson correlation coefficient, Fisher score, information gain, the Relief algorithm, and the chi-square test. We used sensitivity, specificity, accuracy, and the Youden index as performance metrics to evaluate the proposed method.
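To make the idea of combining several filter rankings concrete, here is a minimal sketch. It is not the authors' WRHFS procedure, which this abstract does not specify in detail. It scores each feature with three of the filters named above (standard deviation, Pearson correlation, and Fisher score), converts the scores to ranks, and takes a weighted mean rank. The function names, the equal default weights, the mean-rank aggregation, and the toy data are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between a feature column and 0/1 labels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / sqrt(var_x * var_y)


def fisher_score(xs, ys):
    """Fisher score: between-class scatter divided by within-class scatter."""
    overall = mean(xs)
    between = within = 0.0
    for label in set(ys):
        group = [x for x, y in zip(xs, ys) if y == label]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pstdev(group) ** 2
    return between / within if within > 0 else 0.0


def ranks(scores):
    """Rank 1 = highest score; ties are broken by feature order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def combine_filters(columns, labels, weights=(1.0, 1.0, 1.0)):
    """Weighted mean rank per feature across three filters (lower is better).

    The weights are hypothetical; equal weighting is an assumption.
    """
    per_filter = [
        ranks([pstdev(col) for col in columns]),
        ranks([pearson_abs(col, labels) for col in columns]),
        ranks([fisher_score(col, labels) for col in columns]),
    ]
    total = sum(weights)
    return [
        sum(w * r[i] for w, r in zip(weights, per_filter)) / total
        for i in range(len(columns))
    ]


# Toy data (made up): three features, six samples, 1 = stroke, 0 = control.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # separates the classes well
    [5.0, 5.1, 4.9, 5.0, 5.2, 4.8],  # nearly uninformative
    [2.0, 2.5, 1.5, 2.6, 3.0, 2.2],  # weakly informative
]
mean_ranks = combine_filters(columns, labels)
best_first = sorted(range(len(columns)), key=lambda i: mean_ranks[i])
print(best_first)
```

In a wrapper-style setting, a classifier would then be trained on the top-k features for increasing k, and k would be chosen by a validation metric. That step is omitted here because the abstract does not say which classifier or stopping rule was used.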

Results:

This study chose 792 samples from the electronic records of 13,421 patients in a community hospital. Each sample included 28 features (24 blood test features and 4 demographic features). The evaluation showed that the proposed method selected 9 important features out of the original 28, with a cumulative contribution of 0.51, and significantly outperformed baseline methods. Using only the top 9 features, the WRHFS method achieved a sensitivity of 82.7% (329/398), specificity of 80.4% (317/394), classification accuracy of 81.5% (645/792), and Youden index of 0.63. We have also presented a chart for visualizing the risk of having an ischemic stroke.
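For readers unfamiliar with the Youden index, it is defined as sensitivity + specificity − 1. The short sketch below recomputes it from the sensitivity and specificity counts reported above. The function and variable names are illustrative and do not come from the paper.

```python
def sensitivity(true_pos, total_pos):
    """Fraction of actual positive cases that were detected."""
    return true_pos / total_pos


def specificity(true_neg, total_neg):
    """Fraction of actual negative cases that were correctly ruled out."""
    return true_neg / total_neg


def youden_index(sens, spec):
    """Youden's J statistic: 0 means no better than chance, 1 is perfect."""
    return sens + spec - 1.0


# Counts taken from the reported fractions 329/398 and 317/394.
sens = sensitivity(329, 398)
spec = specificity(317, 394)
j = youden_index(sens, spec)
print(f"Youden index: {j:.2f}")
```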

Conclusions:

This study has proposed, developed, and evaluated a new feature selection method for identifying the most important features for building effective and parsimonious models for stroke risk detection. The findings of this research provide several novel research contributions and practical implications.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.