A unified data‐adaptive framework for high dimensional change point detection, The Journal of the Royal Statistical Society, Series B (Statistical Methodology)

The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 3.1). Pub Date: 2020-06-12. DOI: 10.1111/rssb.12375
Bin Liu, Cheng Zhou, Xinsheng Zhang, Yufeng Liu

In recent years, change point detection for a high dimensional data sequence has become increasingly important in many scientific fields such as biology and finance. The existing literature develops a variety of methods designed for either a specified parameter (e.g. the mean or covariance) or a particular alternative pattern (sparse or dense), but not for both scenarios simultaneously. To overcome this limitation, we provide a general framework for developing tests that are suitable for a large class of parameters, and also adaptive to various alternative scenarios. In particular, by generalizing the classical cumulative sum statistic, we construct the U‐statistic‐based cumulative sum matrix C. Two cases corresponding to common or different change point locations across the components are considered. We then propose two types of individual test statistics by aggregating C on the basis of the adjusted L_p‐norm with p ∈ {1,…,∞}. Combining the corresponding individual tests, we construct two types of data‐adaptive tests for the two cases, which are both powerful under various alternative patterns. A multiplier bootstrap method is introduced for approximating the proposed test statistics’ limiting distributions. With flexible dependence structure across co‐ordinates and mild moment conditions, we show the optimality of our methods theoretically in terms of size and power by allowing the dimension d and the number of parameters q to be much larger than the sample size n. An R package called AdaptiveCpt is developed to implement our algorithms. Extensive simulation studies provide further support for our theory. An application to a comparative genomic hybridization data set also demonstrates the usefulness of our proposed methods.
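The pipeline described in the abstract (a cumulative sum matrix, aggregation by an L_p-type norm, bootstrap calibration, combination across p) can be illustrated with a minimal Python sketch for the simplest case, a change in the mean. This is not the paper's method and not the AdaptiveCpt interface: it uses the classical CUSUM rather than the U-statistic-based matrix, it assumes for illustration that the "adjusted" L_p-norm keeps only the s0 largest absolute entries (the abstract does not define it), and it combines the individual tests with a conservative Bonferroni correction in place of the paper's data-adaptive combination. All function names and defaults below are hypothetical.

```python
import numpy as np


def cusum_matrix(x):
    """Classical mean CUSUM matrix of shape (n-1, d).

    Row k-1 is sqrt(k(n-k)/n) * (mean of first k rows - mean of last n-k rows).
    """
    n, _ = x.shape
    k = np.arange(1, n)[:, None]            # candidate split points 1..n-1
    csum = np.cumsum(x, axis=0)
    left = csum[:-1] / k
    right = (csum[-1] - csum[:-1]) / (n - k)
    return np.sqrt(k * (n - k) / n) * (left - right)


def top_s0_lp_norm(c, s0, p):
    """Row-wise L_p norm of the s0 largest absolute entries of each row.

    Assumption: this stands in for the 'adjusted L_p-norm' of the abstract.
    """
    a = np.sort(np.abs(c), axis=1)[:, ::-1][:, :s0]
    if np.isinf(p):
        return a[:, 0]
    return np.sum(a ** p, axis=1) ** (1.0 / p)


def max_stat(c, s0, p):
    """Individual test statistic: largest aggregated CUSUM over split points."""
    return top_s0_lp_norm(c, s0, p).max()


def bootstrap_pvalues(x, s0=5, ps=(1, 2, np.inf), n_boot=200, seed=0):
    """Gaussian multiplier bootstrap p-value for each p in ps."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    observed = np.array([max_stat(cusum_matrix(x), s0, p) for p in ps])
    centred = x - x.mean(axis=0)            # simple global centring
    exceed = np.zeros(len(ps))
    for _ in range(n_boot):
        e = rng.standard_normal(n)[:, None]  # one multiplier per observation
        c_b = cusum_matrix(e * centred)
        exceed += np.array([max_stat(c_b, s0, p) for p in ps]) >= observed
    return (exceed + 1) / (n_boot + 1)


def bonferroni_pvalue(pvalues):
    """Conservative combination of the individual tests (not the paper's rule)."""
    return min(1.0, len(pvalues) * float(np.min(pvalues)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((80, 10))
    x[40:, :3] += 3.0                        # sparse mean shift after t = 40
    pv = bootstrap_pvalues(x, n_boot=100)
    print(pv, bonferroni_pvalue(pv))
```

Small p puts weight on many coordinates at once and so suits dense changes, while p = ∞ reacts to the single largest coordinate and suits sparse ones; running several values of p and combining them is what makes such a test adaptive to the unknown alternative.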

Updated: 2020-08-10
