Non-numerical nearest neighbor classifiers with value-object hierarchical embedding
Expert Systems with Applications (IF 8.5), Pub Date: 2020-01-15, DOI: 10.1016/j.eswa.2020.113206
Sheng Luo, Duoqian Miao, Zhifei Zhang, Zhihua Wei

Non-numerical classification plays an essential role in many real-world applications such as DNA analysis, recommendation systems and expert systems. The nearest neighbor classifier is one of the most popular and flexible models for performing classification tasks in these applications. However, due to the complexity of non-numerical data, existing nearest neighbor classifiers that use the overlap measure and its variants cannot capture the inherent ordered relationships and statistical information of non-numerical data. This limits the classification performance of nearest neighbor classifiers in non-numerical data environments. To overcome this challenge, we propose a novel object distance metric, the value-object hierarchical metric (VOHM), which is able to capture inherent ordered relationships within non-numerical data. We then construct two nearest neighbor classifiers, the value-object hierarchical embedded nearest neighbor classifier (VO-kNN) and the two-stage value-object hierarchical embedded nearest neighbor classifier (TSVO-kNN), which take advantage of both VOHM and non-numerical feature selection. Experiments show that both VO-kNN and TSVO-kNN mine more knowledge from data and achieve better performance than state-of-the-art classifiers in non-numerical data environments.
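
To make the limitation mentioned in the abstract concrete, here is a minimal sketch of the baseline it refers to: a nearest neighbor classifier using the plain overlap measure. This is not VOHM, VO-kNN or TSVO-kNN, whose definitions the abstract does not give. The function names, the toy data and the choice of k are all assumptions made for illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Overlap measure: count the attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training records closest to the query."""
    # sorted() is stable, so ties keep their original training order.
    order = sorted(range(len(train)),
                   key=lambda i: overlap_distance(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data with two categorical attributes: (size, colour).
train = [("small", "red"), ("small", "blue"),
         ("large", "red"), ("large", "blue")]
labels = ["A", "A", "B", "B"]

# Any mismatch costs exactly 1, so "small" is as far from "medium"
# as it is from "large": the ordering of the values is invisible.
d_near = overlap_distance(("small",), ("medium",))
d_far = overlap_distance(("small",), ("large",))

prediction = knn_predict(train, labels, ("small", "green"), k=2)
```

In this sketch `d_near` and `d_far` are equal, which is the kind of lost ordering information the abstract says the overlap measure and its variants cannot capture.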




Updated: 2020-01-15