Binary Jaya algorithm based on binary similarity measure for feature selection
Journal of Ambient Intelligence and Humanized Computing (IF 3.662), Pub Date: 2021-05-12, DOI: 10.1007/s12652-021-03226-5
Abhilasha Chaudhuri , Tirath Prasad Sahu

Feature selection (FS) has become an indispensable data preprocessing task because of the huge amount of high-dimensional data generated by current technologies. Such data contain irrelevant, redundant, and noisy features that degrade classification accuracy. FS reduces dimensionality by removing these unwanted features and thereby improves classification accuracy. FS can be treated as a binary optimization problem. To solve this problem, this work proposes a new wrapper feature selection technique based on the Jaya algorithm. Three binary variants of the Jaya algorithm are proposed. The first and second, BJaya-S and BJaya-V, are based on transfer functions. The third variant, BJaya-JS, explores the search space on the basis of the Jaccard similarity index. In addition, a probability-based local search technique, Neighbourhood Search, is proposed to balance exploration and exploitation. The variants are evaluated and the best one is selected, then compared with six state-of-the-art feature selection techniques. All methods are tested on 18 high-dimensional standard UCI datasets. The experimental comparison shows that the proposed feature selection technique performs better than its competitors.
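
The abstract names its building blocks without giving formulas, so the following is only a minimal Python sketch of what such pieces commonly look like, not the authors' implementation. It assumes the standard continuous Jaya update, a sigmoid as the S-shaped transfer function, and |tanh| as the V-shaped one (the names BJaya-S and BJaya-V suggest these two families, but the exact functions used in the paper are not stated here). All function names are hypothetical. A candidate solution is a 0/1 list in which a 1 means "keep this feature".

```python
import math
import random


def jaya_step(x, best, worst, rng):
    """Standard continuous Jaya update (assumed; the paper's exact form is not in the abstract).

    Each coordinate moves toward the best solution and away from the worst one.
    """
    new_x = []
    for xi, bi, wi in zip(x, best, worst):
        r1, r2 = rng.random(), rng.random()
        new_x.append(xi + r1 * (bi - abs(xi)) - r2 * (wi - abs(xi)))
    return new_x


def s_shaped(v):
    """Sigmoid, a common S-shaped transfer function (assumption)."""
    return 1.0 / (1.0 + math.exp(-v))


def v_shaped(v):
    """|tanh(v)|, a common V-shaped transfer function (assumption)."""
    return abs(math.tanh(v))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability s_shaped(value)."""
    return [1 if rng.random() < s_shaped(v) else 0 for v in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip each current bit with probability v_shaped(value)."""
    return [1 - b if rng.random() < v_shaped(v) else b
            for v, b in zip(position, current_bits)]


def jaccard_similarity(a, b):
    """Jaccard index of two 0/1 feature masks: |a AND b| / |a OR b|."""
    both = sum(1 for p, q in zip(a, b) if p == 1 and q == 1)
    either = sum(1 for p, q in zip(a, b) if p == 1 or q == 1)
    # Two empty masks are treated as identical (a convention chosen for this sketch).
    return both / either if either else 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    current = [1, 0, 1, 0]
    moved = jaya_step([0.2, -0.4, 0.9, 0.1], [1, 1, 0, 0], [0, 0, 1, 1], rng)
    print(binarize_s(moved, rng))
    print(binarize_v(moved, current, rng))
    print(jaccard_similarity([1, 1, 0, 0], [1, 0, 1, 0]))
```

For the masks `[1, 1, 0, 0]` and `[1, 0, 1, 0]`, one feature is selected by both and three by at least one, so the Jaccard similarity is 1/3. How BJaya-JS turns such a similarity value into a search move is not described in the abstract and is left out of this sketch.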




Updated: 2021-05-12