A setwise EWMA scheme for monitoring high-dimensional datastreams
Random Matrices: Theory and Applications (IF 0.9), Pub Date: 2019-04-04, DOI: 10.1142/s2010326320500045
Long Feng 1 , Haojie Ren 2 , Changliang Zou 2

The monitoring of high-dimensional data streams has become increasingly important for real-time detection of abnormal activities in many statistical process control (SPC) applications. Although multivariate SPC has been extensively studied in the literature, the challenges of designing a practical monitoring scheme for high-dimensional processes when between-stream correlation exists have yet to be addressed well. Classical [Formula: see text]-test-based schemes do not work well because the contamination bias in estimating the covariance matrix grows rapidly as the dimension increases. We propose a test statistic based on the "divide-and-conquer" strategy and integrate it into the multivariate exponentially weighted moving average (EWMA) charting scheme for Phase II process monitoring. The key idea is to calculate the [Formula: see text] statistics on low-dimensional sub-vectors and to combine them. The proposed procedure is essentially distribution-free and computationally efficient. The control limit is obtained from the asymptotic distribution of the test statistic under some mild conditions on the dependence structure of the stream observations. Our asymptotic results also shed light on quantifying the required size of the reference sample. Both theoretical analysis and numerical results show that the proposed method is able to control the false alarm rate and deliver robust change detection.
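
The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' statistic: the abstract elides the formula, so the sketch assumes a Hotelling-type quadratic form on each sub-vector, contiguous blocks of equal size, and a naive standardization that treats the blocks as independent. The function name `setwise_ewma` and the parameters `block_size` and `lam` are hypothetical choices made for illustration only.

```python
import numpy as np


def setwise_ewma(reference, stream, block_size=4, lam=0.2):
    """Return one standardized charting value per row of `stream`.

    Assumptions (not taken from the paper): contiguous blocks, a
    Hotelling-type quadratic form per block, and a centering and scaling
    that treats the block statistics as independent chi-square variables.
    """
    p = reference.shape[1]
    mu = reference.mean(axis=0)  # in-control mean estimated from the reference sample
    blocks = [np.arange(s, min(s + block_size, p)) for s in range(0, p, block_size)]
    # Only small block covariances are estimated and inverted, never the full p x p matrix.
    precisions = [
        np.linalg.inv(np.atleast_2d(np.cov(reference[:, b], rowvar=False)))
        for b in blocks
    ]
    # The steady-state covariance of the EWMA vector is lam / (2 - lam) times
    # the data covariance, so early values are slightly conservative.
    scale = (2.0 - lam) / lam
    w = np.zeros(p)
    out = np.empty(len(stream))
    for t, x in enumerate(stream):
        w = lam * (x - mu) + (1.0 - lam) * w  # multivariate EWMA update
        q = sum(scale * (w[b] @ P @ w[b]) for b, P in zip(blocks, precisions))
        out[t] = (q - p) / np.sqrt(2.0 * p)  # naive standardization (assumption)
    return out


# Synthetic demonstration: 100 in-control observations, then a mean shift of 1.0.
rng = np.random.default_rng(0)
reference = rng.standard_normal((500, 20))
in_control = rng.standard_normal((100, 20))
shifted = rng.standard_normal((100, 20)) + 1.0
stats = setwise_ewma(reference, np.vstack([in_control, shifted]))
```

In practice the values in `stats` would be compared against a control limit. The abstract says that limit comes from the asymptotic distribution of the test statistic under dependence conditions, which this sketch does not reproduce.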

Updated: 2019-04-04