Paper Title
Breast Cancer Prediction using Multilayer Ensemble of SVMs with Genetic Algorithm based Feature Selection
Abstract
In this study, we introduce a novel two-phase model for breast cancer diagnosis. In the first phase, relevant features are selected with a genetic algorithm whose fitness function is the ten-fold cross-validation error rate of an SVM classifier; the genetic algorithm searches for the feature subset that minimizes this error rate. In the second phase, we apply a Bagging-based ensemble of linear support vector machines. Each SVM in the ensemble is trained independently and in parallel on a bootstrap sample, that is, on samples drawn at random with replacement from the training dataset (Bagging). The outputs of these SVMs are then aggregated by another, higher-level linear support vector machine. This two-layer approach, in which individual models are combined by an aggregating model, is called double-layer hierarchical combining. The model was validated on the Breast Cancer Wisconsin (Diagnostic) dataset (WDBC) from the UCI Machine Learning Repository. With this two-layer architecture and feature selection technique, we achieved an accuracy of 98.42%, a specificity of 99.44%, and a sensitivity of 96.7% under ten-fold cross-validation. The performance analysis also indicates that this method outperforms other SVM-based methods reported in the literature on the WDBC dataset.
Keywords - Bagging, Ensemble Classifier, Feature Selection, Genetic Algorithm, Optimization, Stacked Generalization, Support Vector Machine
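
The first phase described in the abstract, a genetic algorithm that minimizes an error rate over feature subsets, can be illustrated with a minimal sketch. This is not the authors' implementation. The function `genetic_feature_selection`, its parameters (population size, tournament selection, single-point crossover, bit-flip mutation, elitism), and the stand-in fitness `toy_error` are all assumptions made for illustration. In the setting of the paper, the fitness would instead be the ten-fold cross-validation error rate of an SVM trained on the features selected by the mask.

```python
import random


def genetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask that minimizes fitness(mask).

    Hypothetical helper: the operators and default parameters are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Random initial population of masks (True = feature is kept).
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    def tournament():
        # Pick two distinct individuals and keep the one with lower error.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each gene flips with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best


# Stand-in fitness so the sketch runs without data or an SVM library.
# RELEVANT is a made-up set of "informative" feature indices.
RELEVANT = {0, 3, 7}


def toy_error(mask):
    missing = sum(1 for i in RELEVANT if not mask[i])
    extra = sum(1 for i, keep in enumerate(mask)
                if keep and i not in RELEVANT)
    return missing + 0.1 * extra  # penalize missing features most


selected = genetic_feature_selection(toy_error, n_features=10)
print([i for i, keep in enumerate(selected) if keep], toy_error(selected))
```

Because the best mask of each generation is carried forward unchanged, the best error found never increases from one generation to the next. Replacing `toy_error` with a function that trains and cross-validates a classifier on the masked features would turn the sketch into a wrapper-style feature selector of the kind the abstract describes, at a correspondingly higher computational cost per fitness evaluation.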