Mining discriminative itemsets in data streams using the tilted-time window model
Knowledge and Information Systems ( IF 2.5 ) Pub Date : 2021-02-15 , DOI: 10.1007/s10115-021-01550-y
Majid Seyfi , Richi Nayak , Yue Xu , Shlomo Geva

A discriminative itemset is an itemset that is frequent in the target data stream and occurs there with much higher frequency than in the rest of the data streams in the dataset. Discriminative itemsets describe the distinguishing features between data streams. Mining discriminative itemsets is important in data streams, where transactions arrive continuously at high speed and in large volume. Compared with frequent itemset mining in a single data stream, discriminative itemset mining poses additional challenges because the Apriori subset property does not apply. We propose an efficient and highly accurate method for mining discriminative itemsets in data streams using a tilted-time window model. The proposed single-pass H-DISSparse algorithm is designed around several well-defined characteristics that aim to improve the approximate frequencies of the itemsets in the tilted-time window model. The data structures are dynamically adjusted in offline time intervals to reflect the discriminative itemset frequencies in different time periods in unsynchronized data streams. Empirical analysis shows that the proposed method is efficient in time and space on fast-growing big data streams.
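To make the definition concrete, the following is a minimal brute-force sketch of what "discriminative" means for two small batches of transactions. It is not the H-DISSparse algorithm and uses no tilted-time window. The function names and the thresholds `min_support` and `min_ratio` are assumptions introduced for illustration only; the paper's exact threshold definitions are not given in the abstract. The sketch also assumes both batches are non-empty.

```python
from collections import Counter
from itertools import combinations


def itemset_counts(transactions, max_size=3):
    """Count every itemset of up to max_size items across the transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def discriminative_itemsets(target, rest, min_support=0.5, min_ratio=2.0, max_size=3):
    """Return itemsets that are frequent in `target` and far more frequent there than in `rest`.

    min_support and min_ratio are hypothetical thresholds chosen for this sketch.
    """
    target_counts = itemset_counts(target, max_size)
    rest_counts = itemset_counts(rest, max_size)
    result = {}
    for itemset, count in target_counts.items():
        target_freq = count / len(target)
        if target_freq < min_support:
            continue  # not frequent in the target stream
        rest_freq = rest_counts.get(itemset, 0) / len(rest)
        # An itemset absent from the rest is treated as infinitely discriminative.
        ratio = target_freq / rest_freq if rest_freq > 0 else float("inf")
        if ratio >= min_ratio:
            result[itemset] = (target_freq, rest_freq)
    return result


# Toy data: the target stream versus the rest of the streams.
target = [["a", "b"], ["a", "b", "c"], ["a", "b"], ["c"]]
rest = [["a"], ["a", "c"], ["b", "c"], ["c"]]

found = discriminative_itemsets(target, rest)
print(found)
```

On this toy data the itemset `("a", "b")` is discriminative while its subset `("a",)` is not, because `"a"` alone is also common in the rest of the streams. This is the reason pruning by subsets, as Apriori does for frequent itemsets, cannot be applied directly here.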




Updated: 2021-02-16