Efficient closed high-utility pattern fusion model in large-scale databases
Information Fusion (IF 14.7), Pub Date: 2021-05-29, DOI: 10.1016/j.inffus.2021.05.011
Jerry Chun-Wei Lin, Youcef Djenouri, Gautam Srivastava

High-Utility Itemset Mining (HUIM) has been a major research topic in recent decades because it reveals profit strategies that industry can use for decision-making. Most existing works mine high-utility itemsets from databases and return a large number of patterns; however, making exact decisions from such a large amount of discovered knowledge remains challenging. Closed High-Utility Itemset Mining (CHUIM) offers a way to present a concise set of high-utility itemsets that can be more effective for making correct decisions. However, none of the existing works handle large-scale databases in order to integrate the knowledge discovered from several distributed databases. In this paper, we first present a large-scale information fusion architecture that integrates the closed high-utility patterns discovered from several distributed databases. A generic composite model is used to cluster transactions according to their correlation, which ensures the correctness and completeness of the fusion model. The well-known MapReduce framework is then deployed in the developed DFM-Miner algorithm to handle big datasets for information fusion and integration. The proposed approach is compared experimentally with the state-of-the-art CHUI-Miner and CLS-Miner algorithms for mining closed high-utility patterns, and the results indicate that the designed model handles large-scale databases well while using less memory. Moreover, the designed MapReduce framework speeds up the mining of closed high-utility patterns in the developed fusion system.
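
To make the notion of a closed high-utility itemset concrete, the following is a minimal brute-force sketch in Python. It is not the DFM-Miner algorithm and involves no clustering or MapReduce; it only illustrates the usual definition, under which an itemset is high-utility if its total utility reaches a threshold and closed if no proper superset has the same support. The toy database, the threshold of 10, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example, quantity times unit profit).
transactions = [
    {"a": 5, "b": 6, "c": 1},
    {"a": 5, "b": 6},
    {"a": 4, "c": 3},
    {"a": 10, "c": 1},
]


def utility_and_support(itemset, db):
    """Return (total utility, support) of itemset over the transactions containing it."""
    total, support = 0, 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
            support += 1
    return total, support


def closed_high_utility_itemsets(db, min_util):
    """Brute-force enumeration; exponential in the number of items, toy inputs only."""
    items = sorted({item for t in db for item in t})
    stats = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            util, sup = utility_and_support(combo, db)
            if sup > 0:
                stats[frozenset(combo)] = (util, sup)

    result = {}
    for itemset, (util, sup) in stats.items():
        if util < min_util:
            continue  # not high-utility
        # Closed: no proper superset occurs in exactly the same number of transactions.
        has_equal_superset = any(
            itemset < other and other_sup == sup
            for other, (_, other_sup) in stats.items()
        )
        if not has_equal_superset:
            result[itemset] = (util, sup)
    return result


chuis = closed_high_utility_itemsets(transactions, min_util=10)
for itemset, (util, sup) in sorted(chuis.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), "utility =", util, "support =", sup)
```

In this toy database, {b} has utility 12 and so is high-utility, but it is not closed because {a, b} occurs in exactly the same two transactions; the closed set therefore reports {a, b} instead, which is the kind of conciseness the abstract refers to.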




Updated: 2021-06-04