Efficient Utility Tree-Based Algorithm to Mine High Utility Patterns Having Strong Correlation
Complexity (IF 1.7), Pub Date: 2021-07-28, DOI: 10.1155/2021/7310137
Rashad Saeed 1, 2 , Azhar Rauf 1 , Fahmi H. Quradaa 1, 2 , Syed Muhammad Asim 3

High Utility Itemset Mining (HUIM) is one of the most investigated tasks in data mining. It has broad applications in domains such as product recommendation, market basket analysis, e-learning, text mining, bioinformatics, and web clickstream analysis. Insights from such pattern analysis provide numerous benefits, including reduced costs, improved competitive advantage, and increased revenue. However, HUIM methods may discover misleading patterns because they do not evaluate the correlation among the items of the extracted patterns. Consequently, a number of algorithms have been proposed to mine correlated high utility itemsets (HUIs). These algorithms still incur a high computational cost in terms of both time and memory consumption. This paper presents an algorithm, named Efficient Correlated High Utility Pattern Mining (ECoHUPM), to efficiently mine high utility patterns whose items are strongly correlated. A new data structure based on the utility tree (UTtree), named CoUTlist, is proposed to store sufficient information for mining the desired patterns. Three pruning properties are introduced to reduce the search space and improve mining performance. Experiments on sparse, very sparse, dense, and very dense datasets indicate that the proposed ECoHUPM algorithm is more efficient than the state-of-the-art CoHUIM and CoHUI-Miner algorithms in terms of both time and memory consumption.
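
To make the problem statement concrete, the following is a minimal brute-force sketch of what "high utility and strongly correlated" can mean on a toy database. It is not the ECoHUPM algorithm: it uses no UTtree, no CoUTlist, and none of the three pruning properties. The abstract does not name a correlation measure, so the Kulczynski measure used here is an assumption, and the toy data, the function names, and the thresholds `min_util` and `min_cor` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example, quantity * unit profit).
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2},
]


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db):
    """Total utility of the itemset over the transactions that contain it."""
    return sum(
        sum(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def kulc(itemset, db):
    """Kulczynski measure (an assumption, not stated in the abstract):
    the mean of support(itemset) / support(item) over the itemset's items."""
    sup_x = support(itemset, db)
    return sum(sup_x / support((i,), db) for i in itemset) / len(itemset)


def correlated_high_utility_itemsets(db, min_util, min_cor):
    """Exhaustively enumerate itemsets and keep those that satisfy both
    thresholds. Exponential in the number of items; illustration only."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for x in combinations(items, k):
            if support(x, db) == 0:
                continue  # itemset never occurs, so it has no utility
            u, c = utility(x, db), kulc(x, db)
            if u >= min_util and c >= min_cor:
                result[x] = (u, c)
    return result


result = correlated_high_utility_itemsets(transactions, min_util=12, min_cor=0.6)
for itemset, (u, c) in sorted(result.items()):
    print(itemset, u, round(c, 3))
```

With these hypothetical thresholds, the itemset {b, c, d} has utility 15 and would pass a utility-only filter, but its Kulczynski value is about 0.56, so it is rejected as weakly correlated. This is the kind of pattern that a plain HUIM method would report and a correlated HUI miner would discard.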

Updated: 2021-07-28