Mining High-Average Utility Itemsets with Positive and Negative External Utilities
New Generation Computing (IF 2.0), Pub Date: 2019-11-07, DOI: 10.1007/s00354-019-00078-8
Irfan Yildirim, Mete Celik

High-utility itemset mining (HUIM) is an emerging data mining topic. It aims to find high-utility itemsets by considering both the internal utility (i.e., quantity) and the external utility (i.e., profit) of items. High-average-utility itemset mining (HAUIM) is an extension of HUIM that provides a fairer measure, named average-utility, by taking into account the length of itemsets in addition to their utilities. Several algorithms have been introduced in the literature for mining high-average-utility itemsets (HAUIs). However, these algorithms assume that databases contain only positive utilities. In some real-world applications, databases may also contain negative utilities. In such databases, existing HAUIM algorithms may fail to discover the complete set of HAUIs because they are designed for positive utilities only. In this study, an algorithm named MHAUIPNU (mining high-average-utility itemsets with positive and negative utilities) is proposed to discover the correct and complete set of HAUIs when both positive and negative utilities are present. MHAUIPNU introduces an upper bound model, three pruning strategies, and a data structure. Experimental results show that MHAUIPNU is very efficient in reducing the size of the search space and thus in mining HAUIs with negative utilities.
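
To make the average-utility measure concrete, the following is a minimal Python sketch. It assumes the commonly used definition, in which the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, and the average-utility is the total of those values over all supporting transactions divided by the itemset length. The toy database, the item names, and the functions `average_utility` and `brute_force_hauis` are hypothetical and chosen only for illustration. This is an exhaustive enumeration, not the MHAUIPNU algorithm: it uses no upper bound, pruning strategy, or dedicated data structure.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper).
# External utility = unit profit per item; item "c" is sold at a loss.
external_utility = {"a": 5, "b": 2, "c": -3}

# Each transaction maps item -> internal utility (purchased quantity).
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 4},
    {"a": 1, "b": 1, "c": 1},
]


def average_utility(itemset, transactions, external_utility):
    """Sum the itemset's utility over transactions containing every item
    of the itemset, then divide by the itemset length."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(
                transaction[item] * external_utility[item] for item in itemset
            )
    return total / len(itemset)


def brute_force_hauis(transactions, external_utility, min_avg_utility):
    """Enumerate every non-empty itemset and keep those whose
    average-utility reaches the threshold. Exponential in the number
    of items, so only suitable for tiny examples."""
    items = sorted(external_utility)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            value = average_utility(itemset, transactions, external_utility)
            if value >= min_avg_utility:
                result[itemset] = value
    return result


if __name__ == "__main__":
    for itemset, value in brute_force_hauis(
        transactions, external_utility, min_avg_utility=8
    ).items():
        print(itemset, value)
```

In this toy database the negative profit of item `c` lowers the average-utility of every itemset that contains it, which shows how negative external utilities enter the measure.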

Updated: 2019-11-07