An efficient Henry gas solubility optimization for feature selection
Expert Systems with Applications (IF 8.5) Pub Date: 2020-03-06, DOI: 10.1016/j.eswa.2020.113364
Nabil Neggaz, Essam H. Houssein, Kashif Hussain

In classification, regression, and other data mining applications, feature selection (FS) is an important preprocessing step that helps avoid the adverse effects of noisy, misleading, and inconsistent features on model performance. Formulating it as a global combinatorial optimization problem, researchers have employed metaheuristic algorithms to select the prominent features, simplifying high-dimensional datasets and enhancing their quality in order to devise efficient knowledge extraction systems. However, when applied to datasets with a very large number of features, these methods often become trapped in local optima because the solution space is so large. In this study, we propose a novel approach to dimensionality reduction that uses the Henry gas solubility optimization (HGSO) algorithm to select significant features and thereby enhance classification accuracy. Using several datasets with a wide range of feature sizes, from small to massive, the proposed method is evaluated against well-known metaheuristic algorithms including the grasshopper optimization algorithm (GOA), whale optimization algorithm (WOA), dragonfly algorithm (DA), grey wolf optimizer (GWO), salp swarm algorithm (SSA), and others from recent relevant literature. We used k-nearest neighbor (k-NN) and support vector machine (SVM) as expert systems to evaluate the selected feature set. Wilcoxon's rank-sum non-parametric statistical test was carried out at the 5% significance level to judge whether the results of the proposed algorithm differ from those of the compared algorithms in a statistically significant way. Overall, the empirical analysis suggests that the proposed approach is significantly effective on low-dimensional as well as considerably high-dimensional datasets, producing 100% accuracy on classification problems with more than 11,000 features.
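The abstract describes a wrapper setup: a metaheuristic proposes feature subsets, and a classifier such as k-NN scores each one. The sketch below illustrates only the scoring side, not HGSO itself. It assumes a binary mask encoding, leave-one-out validation, and a weighted objective `alpha * error + (1 - alpha) * (selected / total)`; that weighting is a common convention in wrapper-based feature selection and is an assumption here, not something the abstract states. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=1):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = [train_y[i] for i in order[:k]]
    return Counter(votes).most_common(1)[0][0]


def project(row, mask):
    """Keep only the features whose mask bit is set."""
    return [value for value, keep in zip(row, mask) if keep]


def loo_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy on the masked feature subset."""
    Xs = [project(row, mask) for row in X]
    hits = 0
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, Xs[i], k) == y[i]
    return hits / len(Xs)


def fitness(X, y, mask, alpha=0.99, k=1):
    """Lower is better. The alpha weighting is an assumed convention."""
    if not any(mask):
        return 1.0  # an empty subset cannot classify anything: worst score
    error = 1.0 - loo_accuracy(X, y, mask, k)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * ratio


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0],
     [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y = [0, 0, 0, 1, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, fitness(X, y, mask))
```

An optimizer such as HGSO would call `fitness` on each candidate mask and keep the lowest-scoring one. On this toy data the mask that drops the noisy feature scores better than the mask that keeps both features.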




Updated: 2020-03-06