Analytic Permutation Testing for Functional Data ANOVA
Journal of Computational and Graphical Statistics (IF 2.4), Pub Date: 2022-05-26, DOI: 10.1080/10618600.2022.2069780
Adam B. Kashlak, Sergii Myroshnychenko, Susanna Spektor

Abstract

Analysis of variance is a cornerstone of statistical hypothesis testing. When data do not satisfy the assumption of univariate normality, nonparametric methods, including rank-based statistics and permutation tests, are used instead. The permutation test is a versatile, exact, nonparametric significance test that requires drastically fewer assumptions than comparable parametric tests. Its main drawback is high computational cost, which makes the approach laborious for comparing multiple samples of complex data types and completely infeasible in any application requiring speedy results, such as high-throughput streaming data. We rectify this problem through the application of concentration inequalities and thus propose a computation-free permutation test, that is, a permutation-less permutation test. This general framework is applied to multivariate and matrix-valued data, with a special emphasis on functional data. We improve these concentration bounds via a novel incomplete beta transform. Our theory is extended from two-sample to k-sample testing through the use of weakly dependent Rademacher chaoses and modified decoupling inequalities. Our methodology is tested on classic functional datasets, including the Berkeley growth curves and the phoneme dataset. We further analyze a novel dataset of 12 spoken vowel sounds that was collected to illustrate the power of the analytic permutation test. Supplementary materials for this article are available online.
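For context, the computational cost the abstract refers to can be seen in a conventional Monte Carlo permutation test. The sketch below is not the authors' analytic method; it is a minimal baseline, assuming each curve has been sampled on a common grid and stored as a row of a 2-D array. The function name `permutation_test`, the choice of test statistic (the Euclidean distance between group mean curves), and the default `n_perm` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def permutation_test(x, y, n_perm=999, seed=0):
    """Monte Carlo two-sample permutation test (illustrative baseline only).

    x, y: 2-D arrays with one discretised curve per row, on a shared grid.
    Statistic (an assumption for this sketch): Euclidean norm of the
    difference between the two group mean curves.
    Returns an estimated p-value.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y], axis=0)
    n_x = x.shape[0]

    def stat(a, b):
        return np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))

    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled rows.
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:n_x]], pooled[idx[n_x:]]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Each call recomputes the statistic `n_perm` times, so the running time grows linearly with the number of permutations and with the size of the data. This repeated resampling is the cost that the abstract proposes to avoid by bounding the permutation p-value analytically with concentration inequalities.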
