Semantic-k-NN algorithm: An enhanced version of traditional k-NN algorithm
Expert Systems with Applications (IF 8.5), Pub Date: 2020-03-13, DOI: 10.1016/j.eswa.2020.113374
Munwar Ali, Low Tang Jung, Abdel-Haleem Abdel-Aty, Mustapha Y. Abubakar, Mohamed Elhoseny, Irfan Ali

The k-NN algorithm is one of the best-known machine learning (ML) algorithms and is widely used in data classification research. With the emergence of big data, the performance and efficiency of the traditional k-NN algorithm are fast becoming a critical issue. The traditional k-NN algorithm handles high-volume, multi-category training datasets inefficiently. It cannot readily filter the training dataset down to the training data most relevant to the intended or targeted test dataset/file, so it has to scan through every category of the training data to classify the targeted data. As such, traditional k-NN is considered not intelligent and consequently suffers from poor accuracy and high computational complexity. A Semantic-kNN (Sk-NN) algorithm for ML is thus proposed in this paper to address these limitations. The proposed Sk-NN leverages semantic itemization and a bigram model to filter the training dataset according to the relevant information in the test dataset. It is aimed at general security applications, such as finding the confidentiality level of data when the algorithm is trained with multiple training categories during the data classification phase. Ultimately, Sk-NN is intended to elevate ML performance in pattern extraction and labeling in the big data context.
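The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of filtering the training set by bigram overlap before running k-NN. It is not the authors' Sk-NN: the semantic itemization step is omitted, and the function names, the "at least one shared bigram" rule, the bag-of-words cosine similarity, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Return the set of adjacent word pairs in a text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def filter_categories(train, test_text):
    """Keep only the categories that share a bigram with the test text.

    Assumed rule: a category is relevant if any of its documents shares at
    least one bigram with the test text. Falls back to the full training
    set when nothing overlaps.
    """
    test_bg = bigrams(test_text)
    relevant = {label for text, label in train if bigrams(text) & test_bg}
    kept = [(text, label) for text, label in train if label in relevant]
    return kept or train


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_classify(train, test_text, k=3):
    """Plain k-NN: majority vote among the k most similar documents."""
    test_vec = Counter(test_text.lower().split())
    ranked = sorted(
        train,
        key=lambda item: cosine(Counter(item[0].lower().split()), test_vec),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def filtered_knn_classify(train, test_text, k=3):
    """Filter the training set by bigram overlap, then run plain k-NN."""
    return knn_classify(filter_categories(train, test_text), test_text, k)


# Hypothetical toy data: (document text, confidentiality label).
TRAIN = [
    ("quarterly salary report for staff", "confidential"),
    ("staff salary report and bonus details", "confidential"),
    ("public holiday schedule announcement", "public"),
    ("office holiday schedule for visitors", "public"),
    ("internal meeting notes", "internal"),
]

label = filtered_knn_classify(TRAIN, "salary report for new staff")
```

In this toy run only the "confidential" documents share a bigram ("salary report") with the test text, so the k-NN vote is taken over two documents instead of five. That reduction in the number of distance computations is the kind of saving the abstract attributes to filtering.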




Updated: 2020-03-13