Abstract
Feature interaction is a ubiquitous issue in machine learning and data mining, and it has attracted growing attention in recent years. In this paper, we propose the Symmetrical Complementary Coefficient, a measure that quantifies feature interactions. Based on it, we improve the Sequential Forward Selection (SFS) algorithm and propose a new feature subset search algorithm, SCom-SFS, which only needs to consider interactions between adjacent features in a given sequence rather than between all pairs of features. Moreover, the discovered feature interactions speed up the search for the optimal feature subset. We also improve the ReliefF algorithm by screening representative samples out of the original data set, which removes the need for random sampling of instances. The improved ReliefF algorithm is shown to be more efficient and reliable. Combining the two modified algorithms yields RRSS, an effective and complete feature selection algorithm. According to the experimental results, RRSS outperforms five classic and two recent feature selection algorithms in terms of the size of the resulting feature subset, Accuracy, Kappa coefficient, and adjusted Mean-Square Error (MSE).
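To make the motivation concrete, the following is a minimal sketch of classic greedy SFS, not of SCom-SFS or RRSS, whose definitions are not given in this abstract. The function `sequential_forward_selection`, the stopping parameter `min_gain`, the feature names, and the scoring function `toy_score` are all hypothetical and chosen only for illustration. The toy score rewards `x2` and `x3` only when they appear together, which is the kind of interaction that plain SFS cannot see when it adds one feature at a time.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Classic greedy SFS: repeatedly add the single feature that most
    improves the score, and stop when no feature improves it by more
    than min_gain (a hypothetical stopping parameter)."""
    selected: List[str] = []
    remaining = list(features)
    best = score(frozenset())
    while remaining:
        # Score every candidate subset formed by adding one more feature.
        candidates = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda pair: pair[0])
        if top_score - best <= min_gain:
            break  # no single feature helps on its own
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


def toy_score(subset: FrozenSet[str]) -> float:
    """Hypothetical subset quality: x1 is useful alone, while x2 and x3
    are useful only together (an interaction). Each feature costs a
    small size penalty."""
    value = 0.0
    if "x1" in subset:
        value += 0.5
    if "x2" in subset and "x3" in subset:
        value += 0.4
    return value - 0.01 * len(subset)


chosen = sequential_forward_selection(["x1", "x2", "x3", "x4"], toy_score)
print(chosen)
print(toy_score(frozenset(chosen)), toy_score(frozenset({"x1", "x2", "x3"})))
```

In this toy setting the greedy search stops after selecting `x1`, even though adding `x2` and `x3` together would score higher. A search procedure that takes feature interactions into account is intended to avoid this kind of miss.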
Acknowledgements
We thank the UCI repository for providing the data sets. The breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia; thanks go to M. Zwitter and M. Soklic for providing the data. The Statlog-Vehicle data set came from the Turing Institute, Glasgow, Scotland. We also thank the developers of the R language and the authors of its packages.
Appendix A: Symmetrical complementary coefficients and thresholds of the remaining seven data sets
Cite this article
Zhang, R., Zhang, Z. Feature selection with Symmetrical Complementary Coefficient for quantifying feature interactions. Appl Intell 50, 101–118 (2020). https://doi.org/10.1007/s10489-019-01518-0
DOI: https://doi.org/10.1007/s10489-019-01518-0