A one‐class peeling method for multivariate outlier detection with applications in phase I SPC
Quality and Reliability Engineering International (IF 2.3), Pub Date: 2020-02-26, DOI: 10.1002/qre.2629
Waldyn G. Martinez, Maria L. Weese, L. Allison Jones-Farmer

In phase I of statistical process control (SPC), control charts are often used as outlier detection methods to assess process stability. Many of these methods require estimation of the covariance matrix, are computationally infeasible, or have not been studied when the dimension of the data, p, is large. We propose the one‐class peeling (OCP) method, a flexible framework that combines statistical and machine learning methods to detect multiple outliers in multivariate data. The OCP method can be applied to phase I of SPC, does not require covariance estimation, and is well suited to high‐dimensional data sets with a high percentage of outliers. Our empirical evaluation suggests that the OCP method performs well in high dimensions and is computationally more efficient and more robust than existing methodologies. We motivate and illustrate the use of the OCP method in a phase I SPC application on an N = 354, p = 1917 dimensional data set containing Wikipedia search results for National Football League (NFL) players, teams, coaches, and managers. The example data set and the R functions OCP.R and OCPLimit.R, which compute the OCP distances and thresholds respectively, are available in the supplementary materials.
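
The abstract does not spell out the OCP steps, so the following Python sketch only illustrates the general idea of "peeling" toward a robust center without estimating a covariance matrix. It is not the authors' OCP.R or OCPLimit.R: it does not use a one-class classifier and instead swaps in a plain Euclidean-distance peel. The function names `peel_center` and `flag_outliers`, the `keep=2` stopping rule, the median/MAD scaling, and the cutoff of 3.0 are all assumptions made for illustration.

```python
import math
import random
import statistics


def peel_center(points, keep=2):
    """Drop the point farthest from the current mean until `keep` points remain.

    Assumption: Euclidean distance to the running mean stands in for the
    boundary found by a one-class classifier.
    """
    remaining = list(points)
    while len(remaining) > keep:
        center = [statistics.fmean(col) for col in zip(*remaining)]
        farthest = max(
            range(len(remaining)),
            key=lambda i: math.dist(remaining[i], center),
        )
        remaining.pop(farthest)
    # The mean of the surviving points serves as the robust center estimate.
    return [statistics.fmean(col) for col in zip(*remaining)]


def flag_outliers(points, threshold=3.0, keep=2):
    """Score each point by its robustly scaled distance to the peeled center."""
    center = peel_center(points, keep)
    dists = [math.dist(p, center) for p in points]
    med = statistics.median(dists)
    mad = statistics.median(abs(d - med) for d in dists)
    scale = 1.4826 * mad  # MAD rescaled to match the normal standard deviation
    if scale == 0:
        raise ValueError("distances have zero spread; cannot scale scores")
    scores = [(d - med) / scale for d in dists]
    # Hypothetical cutoff: flag points whose scaled distance exceeds `threshold`.
    return [s > threshold for s in scores], scores


# Synthetic data (not the NFL data set): 40 in-control points, 5 shifted points.
rng = random.Random(0)
inliers = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
outliers = [[rng.gauss(8, 1) for _ in range(5)] for _ in range(5)]
flags, scores = flag_outliers(inliers + outliers)
print("flagged indices:", [i for i, f in enumerate(flags) if f])
```

Because the center comes from the points that survive peeling rather than from the full sample, a cluster of outliers has little influence on it, and no p × p covariance matrix is ever formed. Peeling one point per pass is quadratic in the number of observations, which is acceptable for a sketch but not tuned for large N.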

Updated: 2020-02-26