Scan B-statistic for kernel change-point detection
Sequential Analysis (IF 0.6), Pub Date: 2019-10-02, DOI: 10.1080/07474946.2019.1686886
Shuang Li 1, Yao Xie 1, Hanjun Dai 2, Le Song 2

Abstract Detecting the emergence of an abrupt change-point is a classic problem in statistics and machine learning. Kernel-based nonparametric statistics have been used for this task; they require fewer assumptions on the distributions than parametric approaches and can handle high-dimensional data. In this article, we focus on the scenario in which the amount of background data is large and propose computationally efficient kernel-based statistics for change-point detection, inspired by the recently developed B-statistics. A novel theoretical result of the article is the characterization of the tail probability of these statistics using the change-of-measure technique, which focuses on characterizing the tail of the detection statistic rather than obtaining its asymptotic distribution under the null distribution. Such approximations are crucial to controlling the false alarm rate, which corresponds to the average run length in online change-point detection. Our approximations are shown to be highly accurate. Thus, they provide a convenient way to find detection thresholds for online cases without resorting to more expensive simulations. We show that our methods perform well on both synthetic data and real data.
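
The abstract does not spell out the statistic itself, so the following is only a minimal sketch of the general idea behind a block-averaged kernel statistic, not the article's exact definition. It assumes a Gaussian kernel, the standard unbiased MMD² estimate between two equal-size blocks, and a plain average over reference blocks sampled from the background data. The function names, the bandwidth, and the block counts are all hypothetical, and the variance normalization and detection thresholds discussed in the abstract are deliberately left out.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased MMD^2 estimate between two blocks of the same size."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("blocks must have equal size of at least 2")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the unbiased estimate skips diagonal terms
            total += (
                gaussian_kernel(xs[i], xs[j], bandwidth)
                + gaussian_kernel(ys[i], ys[j], bandwidth)
                - gaussian_kernel(xs[i], ys[j], bandwidth)
                - gaussian_kernel(xs[j], ys[i], bandwidth)
            )
    return total / (n * (n - 1))


def block_averaged_mmd2(reference, test_block, num_blocks, rng, bandwidth=1.0):
    """Average MMD^2 between a test block and several reference blocks.

    Assumption: reference blocks are drawn at random (without replacement
    inside a block) from the background data; no normalization is applied.
    """
    block_size = len(test_block)
    estimates = []
    for _ in range(num_blocks):
        ref_block = rng.sample(reference, block_size)
        estimates.append(mmd2_unbiased(ref_block, test_block, bandwidth))
    return sum(estimates) / num_blocks


# Toy data: a large background sample and two candidate test blocks.
rng = random.Random(0)
reference = [[rng.gauss(0.0, 1.0)] for _ in range(200)]
same_block = [[rng.gauss(0.0, 1.0)] for _ in range(20)]
shifted_block = [[rng.gauss(3.0, 1.0)] for _ in range(20)]

stat_same = block_averaged_mmd2(reference, same_block, 5, rng)
stat_shifted = block_averaged_mmd2(reference, shifted_block, 5, rng)
print(stat_same, stat_shifted)
```

In this toy setup the block drawn from the shifted distribution should give a clearly larger value than the block drawn from the background distribution. Each block costs a number of kernel evaluations quadratic in the block size only, which is why block averaging stays cheap when the background data set is large.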

Updated: 2019-10-02