Interactive Change Point Detection using optimisation approach and Bayesian statistics applied to real world applications
arXiv - CS - Numerical Analysis Pub Date : 2021-06-17 , DOI: arxiv-2106.09691
Rebecca Gedda, Larisa Beilina, Ruomu Tan

Change point detection is becoming increasingly important as datasets grow in size, and unsupervised detection algorithms can help users process the data. A number of unsupervised algorithms, based on different principles, have been developed to detect change points. One approach is to define an optimisation problem and minimise a cost function together with a penalty function. In this optimisation approach, the choice of cost function affects the predictions made by the algorithm. Extending existing studies, a new type of cost function using Tikhonov regularisation is introduced. Another approach uses Bayesian statistics to calculate the posterior probability of a specific point being a change point. It uses a priori knowledge of the distance between consecutive change points and a likelihood function that carries information about the segments. The optimisation and Bayesian approaches for offline change point detection are studied and applied to simulated datasets as well as a real-world multi-phase dataset. The two approaches have previously been studied separately; the novelty lies in comparing their predictions in a specific setting consisting of simulated datasets and a real-world example. The study finds that the performance of the change point detection algorithms is affected by the features in the data.
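To make the optimisation approach concrete, the following is a minimal sketch of penalised change point detection by exhaustive dynamic programming (optimal partitioning). It is not the authors' method: the squared-error segment cost stands in for whichever cost function is chosen (the Tikhonov-regularised cost from the paper is not reproduced here), the penalty is assumed to be a constant per change point, and the names `segment_cost` and `detect_change_points` are hypothetical.

```python
import numpy as np


def segment_cost(prefix, prefix_sq, start, end):
    """Squared-error cost of signal[start:end]: sum of squared deviations from the segment mean."""
    n = end - start
    s = prefix[end] - prefix[start]
    sq = prefix_sq[end] - prefix_sq[start]
    return sq - s * s / n


def detect_change_points(signal, penalty):
    """Return change point indices minimising total segment cost + penalty per change point.

    Assumption: a constant penalty per change point and a squared-error cost,
    chosen only for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    # Prefix sums let each segment cost be evaluated in O(1).
    prefix = np.concatenate(([0.0], np.cumsum(signal)))
    prefix_sq = np.concatenate(([0.0], np.cumsum(signal ** 2)))

    # best[t] is the minimal penalised cost of signal[:t].
    # Starting at -penalty means the first segment is not penalised.
    best = np.full(n + 1, np.inf)
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)  # start index of the final segment of signal[:t]

    for end in range(1, n + 1):
        for start in range(end):
            total = best[start] + segment_cost(prefix, prefix_sq, start, end) + penalty
            if total < best[end]:
                best[end] = total
                last[end] = start

    # Walk back through the stored segment starts to recover the change points.
    change_points = []
    idx = n
    while idx > 0:
        idx = last[idx]
        if idx > 0:
            change_points.append(int(idx))
    return sorted(change_points)


# Noise-free step signal: the mean jumps from 0 to 5 at index 50.
step_signal = [0.0] * 50 + [5.0] * 50
print(detect_change_points(step_signal, penalty=10.0))  # [50]
```

The double loop costs O(n^2) cost evaluations, which is fine for a sketch; pruned variants exist for long signals. A larger penalty yields fewer predicted change points, and swapping `segment_cost` for a different cost function changes which features of the data the predictions respond to.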

Updated: 2021-06-18