Integrating Cuckoo search-Grey wolf optimization and Correlative Naive Bayes classifier with Map Reduce model for big data classification
Data & Knowledge Engineering (IF 2.7), Pub Date: 2019-12-27, DOI: 10.1016/j.datak.2019.101788
Chitrakant Banchhor, N. Srinivasu

Big data is increasingly used in areas such as industry, finance, and medicine, because it can address the challenges of processing large amounts of data. The MapReduce model is one of the techniques most widely and effectively used to classify big data. This paper develops an approach to big data classification that combines a Cuckoo–Grey wolf based Correlative Naive Bayes classifier with the MapReduce Model (CGCNB-MRM). A novel classifier, the Cuckoo–Grey wolf based Correlative Naive Bayes classifier (CG-CNB), is designed by modifying the CNB classifier with a newly developed optimization algorithm, Cuckoo–Grey Wolf based Optimization (CGWO). The CGWO algorithm integrates the Cuckoo Search (CS) algorithm into the Grey Wolf Optimizer (GWO) and optimizes the CNB model by selecting its model parameters optimally. Finally, the proposed CGCNB-MRM approach classifies each data sample based on the probability index table and the posterior probability of the data. Three metrics (accuracy, sensitivity, and specificity) are used to evaluate the proposed CGCNB-MRM approach, which achieves 80.7% accuracy, 84.5% sensitivity, and 76.9% specificity, demonstrating its effectiveness in big data classification.
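The abstract reports accuracy, sensitivity, and specificity without restating how they are computed. The following is a minimal sketch of the standard confusion-matrix definitions for a binary task; it is not taken from the paper, and the function name `classification_metrics`, its `positive` parameter, and the sample labels are illustrative assumptions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Standard definitions, assumed here rather than quoted from the paper:
      accuracy    = (TP + TN) / (TP + TN + FP + FN)
      sensitivity = TP / (TP + FN)   (true positive rate)
      specificity = TN / (TN + FP)   (true negative rate)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    total = tp + tn + fp + fn
    # Guard against empty denominators (e.g. no positive samples at all).
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
accuracy, sensitivity, specificity = classification_metrics(y_true, y_pred)
```

Counting the cells explicitly keeps the sketch dependency-free; on a real dataset a library routine for the confusion matrix would normally be used instead.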




Updated: 2019-12-27