Smooth estimates of multiple quantiles in dynamically varying data streams
Pattern Analysis and Applications (IF 3.7), Pub Date: 2019-04-09, DOI: 10.1007/s10044-019-00794-3
Hugo Lewi Hammer , Anis Yazidi

In this paper, we investigate the problem of estimating multiple quantiles when samples are received online (data stream). We assume a dynamical system, i.e., the distribution of the samples from the data stream changes with time. A major challenge of using incremental quantile estimators to track multiple quantiles is that the monotone property of quantiles is not guaranteed to hold, i.e., the estimate of a lower quantile might erroneously exceed the estimate of a higher quantile. Surprisingly, we have found only two papers in the literature that attempt to counter this challenge, namely the works of Cao et al. (Proceedings of the first ACM workshop on mobile internet through cellular networks, ACM, 2009) and Hammer and Yazidi (Proceedings of the 30th international conference on industrial engineering and other applications of applied intelligent systems (IEA/AIE), France, Springer, 2017), where the latter is a preliminary version of the work in this paper. Furthermore, the state-of-the-art incremental quantile estimator called deterministic update-based multiplicative incremental quantile estimator (DUMIQE), due to Yazidi and Hammer (IEEE Trans Cybernet, 2017), fails to guarantee the monotone property when estimating multiple quantiles. A weakness of the solutions in Cao et al. (2009) and Hammer and Yazidi (2017) is that, even though the estimates satisfy the monotone property of quantiles, they can be highly irregular relative to each other, which is usually unrealistic from a practical point of view. In this paper, we propose generating the quantile estimates by inserting the quantile probabilities (e.g., \(0.1, 0.2, \ldots , 0.9\)) into a monotonically increasing and infinitely smooth function (one that can be differentiated infinitely many times). The function is incrementally updated from the data stream. The monotonicity and smoothness of the function ensure that both the monotone property and the regularity requirement of the quantile estimates are satisfied. The experimental results show that the method performs very well and estimates multiple quantiles more precisely than the original DUMIQE (Yazidi and Hammer 2017) and the approaches reported in Hammer and Yazidi (2017) and Cao et al. (2009).
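
The monotonicity problem described in the abstract can be illustrated with a minimal sketch. The update rule below is a generic additive stochastic-approximation estimator, used here as an assumption for illustration; it is not DUMIQE and not the method proposed in the paper. The function `logistic_quantile` is likewise a hypothetical stand-in for "a monotonically increasing and infinitely smooth function"; the abstract does not specify which function the authors use or how it is updated, so no update step is shown for it.

```python
import math


def additive_update(estimates, probs, x, step):
    """Update each quantile estimate independently from one sample x.

    Generic stochastic-approximation rule (an assumption, not DUMIQE):
    move up by step * p if x is above the estimate,
    move down by step * (1 - p) otherwise.
    """
    return [
        q_hat + step * (p - (1.0 if x <= q_hat else 0.0))
        for q_hat, p in zip(estimates, probs)
    ]


def count_crossings(estimates):
    """Count adjacent pairs where the lower quantile exceeds the higher one."""
    return sum(1 for lo, hi in zip(estimates, estimates[1:]) if lo > hi)


def logistic_quantile(p, location, scale):
    """Hypothetical smooth, strictly increasing function of p (for scale > 0)."""
    return location + scale * math.log(p / (1.0 - p))


# Two correctly ordered estimates for the 0.4 and 0.5 quantiles.
probs = [0.4, 0.5]
before = [1.00, 1.02]

# A sample that lands between them pushes the lower estimate up and the
# higher estimate down, so independent updates can swap their order.
after = additive_update(before, probs, x=1.01, step=0.1)
crossings_after = count_crossings(after)

# Reading all quantiles off one increasing function keeps them ordered
# by construction, whatever the parameter values are (given scale > 0).
grid = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
smooth = [logistic_quantile(p, location=1.0, scale=0.5) for p in grid]
crossings_smooth = count_crossings(smooth)

print(after, crossings_after, crossings_smooth)
```

The design point is that ordering is enforced by the shape of the function rather than by correcting individual estimates after the fact; only the function's parameters would need to be tracked from the stream.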

Updated: 2019-04-09