RHUPS
ACM Transactions on Intelligent Systems and Technology (IF 5), Pub Date: 2021-01-13, DOI: 10.1145/3430767
Yoonji Baek 1, Unil Yun 1, Heonho Kim 1, Hyoju Nam 1, Hyunsoo Kim 1, Jerry Chun-Wei Lin 2, Bay Vo 3, Witold Pedrycz 4

Real-world databases have several distinctive characteristics. New data is inserted continuously, so the database grows without bound, and the database holds a variety of information about its items rather than only their presence or absence. Recently generated data also has a greater influence than older data. Databases with these properties are called time-sensitive non-binary stream databases; examples include web-server click data, market sales data, sensor network data, and network traffic measurements. Many high utility pattern mining and stream pattern mining methods have been proposed, but they are not well suited to such databases because they find valid patterns while accounting for only some of the characteristics above. Knowledge-based software that efficiently finds meaningful information in databases with all of these characteristics is therefore required. In this article, we propose an intelligent information system that applies the sliding window model to calculate the influence of each batch's insertion time in a large-scale stream database and mines recent high utility patterns without generating candidate patterns. In addition, a novel list-based data structure is proposed for fast and efficient management of time-sensitive stream databases. Our technique is compared with state-of-the-art algorithms in experiments on real and synthetic datasets. The experimental results show that our approach outperforms previously proposed methods in terms of runtime, memory usage, and scalability.
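
To make the idea of recency-weighted utility over a sliding window concrete, here is a minimal Python sketch. It is not the RHUPS algorithm: it enumerates itemsets by brute force (that is, it does generate candidates) and does not use the list-based structure the abstract mentions. The function name, the exponential `decay` factor, the `max_len` cap, and the data layout (each transaction maps an item to a quantity, and `profit` maps an item to a unit profit) are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def recent_high_utility_patterns(batches, profit, window_size, decay,
                                 min_util, max_len=2):
    """Brute-force sketch (hypothetical helper, not the paper's method).

    Keeps only the newest `window_size` batches and sums, per itemset,
    the transaction utilities weighted by decay ** (age of the batch).
    """
    # A bounded deque drops the oldest batch when a new one arrives.
    window = deque(maxlen=window_size)
    for batch in batches:
        window.append(batch)

    totals = {}
    newest = len(window) - 1
    for idx, batch in enumerate(window):
        # Assumed weighting: newest batch has weight 1, older ones decay.
        weight = decay ** (newest - idx)
        for transaction in batch:
            items = sorted(transaction)
            for size in range(1, max_len + 1):
                for itemset in combinations(items, size):
                    # Utility = quantity * unit profit, summed over the itemset.
                    utility = sum(transaction[i] * profit[i] for i in itemset)
                    totals[itemset] = totals.get(itemset, 0.0) + weight * utility

    # Keep only itemsets whose decayed utility reaches the threshold.
    return {p: u for p, u in totals.items() if u >= min_util}


# Illustrative data: three batches, window of two, so the first batch expires.
batches = [[{"a": 1, "b": 2}], [{"a": 2}], [{"a": 1, "b": 1}]]
profit = {"a": 5, "b": 2}
patterns = recent_high_utility_patterns(batches, profit, window_size=2,
                                        decay=0.5, min_util=6)
```

With this toy input, the middle batch counts at half weight and the newest at full weight, so an item that appears in both accumulates utility from each. A candidate-free, list-based approach such as the one the abstract describes would avoid the exhaustive enumeration used here.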

Updated: 2021-01-13