Assessing the significance of directed and multivariate measures of linear dependence between time series
Physical Review Research, Pub Date: 2021-02-12, DOI: 10.1103/physrevresearch.3.013145
Oliver M. Cliff, Leonardo Novelli, Ben D. Fulcher, James M. Shine, Joseph T. Lizier

Inferring linear dependence between time series is central to our understanding of natural and artificial systems. Unfortunately, the hypothesis tests that are used to determine statistically significant directed or multivariate relationships from time-series data often yield spurious associations (Type I errors) or omit causal relationships (Type II errors). This is due to the autocorrelation present in the analyzed time series, a property that is ubiquitous across diverse applications, from brain dynamics to climate change. Here we show that, for limited data, this issue cannot be mediated by fitting a time-series model alone (e.g., in Granger causality or prewhitening approaches), and instead that the degrees of freedom in statistical tests should be altered to account for the effective sample size induced by cross-correlations in the observations. This insight enabled us to derive modified hypothesis tests for any multivariate correlation-based measures of linear dependence between covariance-stationary time series, including Granger causality and mutual information with Gaussian marginals. We use both numerical simulations (generated by autoregressive models and digital filtering) as well as recorded fMRI-neuroimaging data to show that our tests are unbiased for a variety of stationary time series. Our experiments demonstrate that the commonly used F- and χ²-tests can induce significant false-positive rates of up to 100% for both measures, with and without prewhitening of the signals. These findings suggest that many dependencies reported in the scientific literature may have been, and may continue to be, spuriously reported or missed if modified hypothesis tests are not used when analyzing time series.
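
The effective-sample-size idea in the abstract can be illustrated in the simplest setting, a Pearson correlation between two autocorrelated series. The sketch below uses Bartlett's classical formula, N_eff = N / (1 + 2 Σ_k r_xx(k) r_yy(k)), to shrink the degrees of freedom of a Student's t-test. It is not the authors' modified test, which also covers Granger causality and multivariate measures. The function names, the AR(1) coefficient 0.9, the series length, and the lag cutoff `max_lag` are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def simulate_ar1(n, phi, rng):
    """Simulate n samples of x[t] = phi * x[t-1] + noise, after a burn-in."""
    burn = 200  # discard the initial transient (assumed length)
    noise = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + noise[t]
    return x[burn:]


def autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: n - k], x[k:]) / denom
                     for k in range(max_lag + 1)])


def effective_sample_size(x, y, max_lag):
    """Bartlett-style effective sample size for the correlation of x and y."""
    n = len(x)
    rx = autocorr(x, max_lag)
    ry = autocorr(y, max_lag)
    factor = 1.0 + 2.0 * np.sum(rx[1:] * ry[1:])
    if factor <= 0:
        # Noisy estimate; fall back to the nominal sample size (assumption).
        return float(n)
    # Keep at least 3 samples so that the t-test has positive dof.
    return max(n / factor, 3.0)


def t_test_pvalue(r, n):
    """Two-sided p-value for correlation r, assuming n independent samples."""
    dof = n - 2
    t = r * np.sqrt(dof / (1.0 - r ** 2))
    return 2.0 * stats.t.sf(abs(t), dof)


# Two independent AR(1) series: any measured correlation is spurious.
rng = np.random.default_rng(0)
x = simulate_ar1(500, 0.9, rng)
y = simulate_ar1(500, 0.9, rng)

r = np.corrcoef(x, y)[0, 1]
n_eff = effective_sample_size(x, y, max_lag=50)
print("r =", r)
print("nominal N =", len(x), " effective N =", n_eff)
print("naive p-value    =", t_test_pvalue(r, len(x)))
print("adjusted p-value =", t_test_pvalue(r, n_eff))
```

For strongly autocorrelated series the effective sample size is well below the nominal one, so the adjusted p-value is never smaller than the naive one. This is the mechanism by which unadjusted tests inflate the false-positive rate.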

Updated: 2021-02-12