CARs-Lands: an Associative Classifier for Large-scale Datasets
Pattern Recognition (IF 7.5), Pub Date: 2020-04-01, DOI: 10.1016/j.patcog.2019.107128
Mehrdad Almasi, Mohammad Saniee Abadeh

Abstract Associative classifiers are among the most efficient classifiers for large datasets. However, they are not suitable for direct use on large-scale data problems. Associative classifiers discover frequent rules, rare rules, or both in order to build an efficient classifier. Rule discovery requires exploring a large solution space in a well-organized manner; hence, associative classification methods designed for large datasets do not scale to large-scale datasets because of memory and time-complexity constraints. The proposed method, CARs-Lands, is an efficient distributed associative classifier. CARs-Lands first generates a modified dataset made up of sub-datasets that are well suited to producing classification association rules (CARs) in parallel. The dataset produced by CARs-Lands contains two types of instances: main instances and neighbor instances. A main instance is either a real instance from the training dataset or a meta-instance, which does not appear in the training dataset; each main instance has several neighbor instances from the training dataset, and together they form a sub-dataset. These sub-datasets are used for parallel local association rule mining. In CARs-Lands, local association rules lead to more accurate prediction because each test instance is classified by the association rules of its nearest neighbors in the training dataset. The approach is evaluated in terms of accuracy on six real-world large-scale datasets against five recent and well-known methods. Experimental results show that the proposed classification method achieves high prediction accuracy and is highly competitive with the other classification methods.
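
The idea of sub-datasets and local rules can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the distance measure, how main instances or meta-instances are chosen, or which rule miner is used. The sketch assumes categorical attributes, Hamming distance, hand-picked main instances (no meta-instances), a brute-force rule miner, and classification by the rules of the nearest main instance's sub-dataset. All function names and parameters (`build_sub_datasets`, `mine_cars`, `classify`, `min_support`, `min_conf`, `max_len`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of attribute positions at which two instances differ."""
    return sum(u != v for u, v in zip(a, b))


def build_sub_datasets(X, y, main_ids, k):
    """For each main instance, collect its k nearest training instances."""
    subs = []
    for m in main_ids:
        order = sorted(range(len(X)), key=lambda i: hamming(X[m], X[i]))
        neigh = order[:k]  # includes the main instance itself (distance 0)
        subs.append(([X[i] for i in neigh], [y[i] for i in neigh]))
    return subs


def mine_cars(sub_X, sub_y, min_support=2, min_conf=0.8, max_len=2):
    """Brute-force mining of rules {(attr_index, value), ...} -> class label."""
    body_counts = Counter()
    rule_counts = Counter()
    for row, label in zip(sub_X, sub_y):
        items = list(enumerate(row))
        for size in range(1, max_len + 1):
            for body in combinations(items, size):
                body_counts[body] += 1
                rule_counts[(body, label)] += 1
    rules = []
    for (body, label), n in rule_counts.items():
        conf = n / body_counts[body]
        if n >= min_support and conf >= min_conf:
            rules.append((conf, n, body, label))
    # Prefer high confidence, then high support, then shorter rule bodies.
    rules.sort(key=lambda r: (-r[0], -r[1], len(r[2])))
    return rules


def classify(x, X, main_ids, local_rules, fallbacks):
    """Apply the best matching local rule of the nearest main instance."""
    j = min(range(len(main_ids)), key=lambda t: hamming(x, X[main_ids[t]]))
    for conf, n, body, label in local_rules[j]:
        if all(x[a] == v for a, v in body):
            return label
    return fallbacks[j]  # majority class of that sub-dataset


# Toy training data (invented for illustration only).
X = [("sunny", "hot"), ("sunny", "mild"), ("sunny", "cool"),
     ("rainy", "hot"), ("rainy", "mild"), ("rainy", "cool")]
y = ["no", "no", "no", "yes", "yes", "yes"]
main_ids = [0, 3]  # assumption: main instances picked by hand

subs = build_sub_datasets(X, y, main_ids, k=3)
# Each sub-dataset is mined independently, so this loop could be distributed.
local_rules = [mine_cars(sx, sy) for sx, sy in subs]
fallbacks = [Counter(sy).most_common(1)[0][0] for _, sy in subs]

print(classify(("rainy", "cool"), X, main_ids, local_rules, fallbacks))
print(classify(("sunny", "mild"), X, main_ids, local_rules, fallbacks))
```

Because each call to `mine_cars` only sees one small sub-dataset, the mining step has no shared state and could be spread across workers, which is the property the abstract relies on for parallel rule mining.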

Updated: 2020-04-01