In-order sliding-window aggregation in worst-case constant time
The VLDB Journal (IF 4.2), Pub Date: 2021-06-03, DOI: 10.1007/s00778-021-00668-3
Kanat Tangwongsan, Martin Hirzel, Scott Schneider

Sliding-window aggregation is a widely used approach for extracting insights from the most recent portion of a data stream. While aggregations of interest can usually be expressed as binary operators that are associative, they are not necessarily commutative or invertible. Non-invertible operators, however, are difficult to support efficiently. DABA is the first algorithm for sliding-window aggregation with worst-case constant time. Prior to DABA, the best published algorithms would require \(O(\log n)\) aggregation steps per window operation for a window of size \(n\). For strictly in-order streams, this bound could be improved to \(O(1)\) aggregation steps in the amortized sense, but it was not known how to achieve an \(O(1)\) bound in the worst case, which is critical for latency-sensitive applications. In this article, besides describing DABA in more detail, we introduce a new variant, DABA Lite, which achieves the same time bounds in less memory. Whereas DABA requires space for storing \(2n\) partial aggregates, DABA Lite only requires space for \(n+2\) partial aggregates. Our experiments on synthetic and real data support the theoretical findings.
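To make the amortized \(O(1)\) baseline mentioned in the abstract concrete, here is a minimal Python sketch of the well-known two-stacks technique for in-order (FIFO) windows. It is not DABA or DABA Lite, and it is not taken from the paper; the class name `TwoStacksWindow` and its method names are assumptions chosen for illustration. It only requires an associative operator with an identity element, so it works for non-invertible operators such as `max` and non-commutative ones such as string concatenation.

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStacksWindow(Generic[T]):
    """Illustrative FIFO window over an associative operator (hypothetical name).

    Amortized O(1) operator applications per insert/evict/query, but a single
    evict can cost O(n) when it has to flip the back stack onto the front stack.
    """

    def __init__(self, op: Callable[[T, T], T], identity: T) -> None:
        self.op = op
        self.identity = identity
        # Entries are (value, aggregate).
        # front: top is the OLDEST element; aggregate covers it and all newer
        #        elements still in front, in stream order.
        # back:  top is the NEWEST element; aggregate covers all elements in
        #        back from oldest to this one, in stream order.
        self.front: List[Tuple[T, T]] = []
        self.back: List[Tuple[T, T]] = []

    def insert(self, value: T) -> None:
        """Append the newest element to the window."""
        prev = self.back[-1][1] if self.back else self.identity
        self.back.append((value, self.op(prev, value)))

    def evict(self) -> None:
        """Remove the oldest element from the window."""
        if not self.front:
            if not self.back:
                raise IndexError("evict from an empty window")
            # Flip: this loop is the O(n) worst case that causes latency spikes.
            while self.back:
                value, _ = self.back.pop()
                newer = self.front[-1][1] if self.front else self.identity
                self.front.append((value, self.op(value, newer)))
        self.front.pop()

    def query(self) -> T:
        """Aggregate of the whole window, oldest to newest."""
        f = self.front[-1][1] if self.front else self.identity
        b = self.back[-1][1] if self.back else self.identity
        return self.op(f, b)


if __name__ == "__main__":
    w = TwoStacksWindow(max, float("-inf"))  # max has no inverse
    for x in (3, 1, 4):
        w.insert(x)
    w.evict()          # drops 3; triggers a flip of all three elements
    print(w.query())   # 4
```

The flip inside `evict` is where the worst case departs from the amortized bound: most calls pop one entry, but the call that finds `front` empty touches every element in the window. The abstract's claim is that DABA removes that spike while keeping constant time per operation.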




Updated: 2021-06-03