Mining top-k frequent patterns from uncertain databases
Applied Intelligence (IF 5.3), Pub Date: 2020-01-23, DOI: 10.1007/s10489-019-01622-1
Tuong Le, Bay Vo, Van-Nam Huynh, Ngoc Thanh Nguyen, Sung Wook Baik

Mining uncertain frequent patterns (UFPs) from uncertain databases is a recently introduced problem, and various approaches to it have been proposed over the last decade. However, traditional approaches often discover too many UFPs, so systems spend considerable time and resources ranking them to find the most promising patterns. This paper therefore introduces the task of mining top-k UFPs from uncertain databases and proposes an efficient method for it, named TUFP (mining Top-k UFPs). Effective threshold-raising strategies help the proposed algorithm reduce the number of generated candidates, which improves both runtime and memory usage. Finally, experiments compare TUFP with two state-of-the-art approaches (CUFP-mine and LUNA) on the number of generated candidates, mining time, memory usage and scalability. The performance studies show that TUFP is efficient in terms of mining time, memory usage and scalability for mining top-k UFPs.
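The abstract does not spell out TUFP's data structures or its specific threshold-raising strategies, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the expected-support model that is common in this literature (the support of an itemset is the sum, over transactions, of the product of its items' existence probabilities) and a plain depth-first enumeration. The names `expected_support` and `top_k_ufps` are hypothetical. The internal threshold starts at zero and rises to the k-th best expected support seen so far, which prunes candidates that can no longer reach the top k.

```python
import heapq
from itertools import count


def expected_support(itemset, database):
    """Sum over transactions of the product of the items' probabilities.

    Assumption: each transaction is a dict mapping item -> existence
    probability; an item absent from a transaction has probability 0.
    """
    total = 0.0
    for transaction in database:
        prob = 1.0
        for item in itemset:
            prob *= transaction.get(item, 0.0)
            if prob == 0.0:
                break
        total += prob
    return total


def top_k_ufps(database, k):
    """Return up to k (itemset, expected support) pairs, best first.

    Illustrative depth-first search with a rising threshold; this is
    not the TUFP algorithm from the paper.
    """
    items = sorted({item for transaction in database for item in transaction})
    heap = []        # min-heap of (support, tie-breaker, itemset)
    tie = count()    # keeps heap comparisons away from the tuples of items
    threshold = 0.0  # raised once k patterns have been collected

    def offer(itemset, support):
        nonlocal threshold
        if len(heap) < k:
            heapq.heappush(heap, (support, next(tie), itemset))
        elif support > heap[0][0]:
            heapq.heapreplace(heap, (support, next(tie), itemset))
        if len(heap) == k:
            threshold = heap[0][0]  # k-th best support so far

    def extend(prefix, start):
        for i in range(start, len(items)):
            candidate = prefix + (items[i],)
            support = expected_support(candidate, database)
            # Expected support never grows when items are added, so a
            # candidate at or below the threshold cannot lead to a
            # top-k pattern and its whole subtree is skipped.
            if support <= 0.0 or (len(heap) == k and support <= threshold):
                continue
            offer(candidate, support)
            extend(candidate, i + 1)

    if k > 0:
        extend((), 0)
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[2]))
    return [(itemset, support) for support, _, itemset in ranked]


# Toy uncertain database (made-up probabilities, for illustration only).
toy_db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"b": 0.7, "c": 0.6},
]
top2 = top_k_ufps(toy_db, 2)
```

On the toy data, once two patterns are held the threshold rises and the remaining candidates are discarded without exploring their supersets, which is the effect the abstract attributes to threshold raising in general terms.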




Updated: 2020-04-20