Spatial autocorrelation for massive spatial data: verification of efficiency and statistical power asymptotics
Journal of Geographical Systems (IF 2.417), Pub Date: 2019-03-25, DOI: 10.1007/s10109-019-00293-3
Qing Luo, Daniel A. Griffith, Huayi Wu

Spatial data containing massive numbers of observations have been a hot topic in recent years and the subject of many studies. Because initial developments for classical spatial autocorrelation statistics are based on rather small sample sizes, this paper presents, in the context of massive spatial datasets, extensions to efficiency and statistical power comparisons between the Moran coefficient and the Geary ratio for different variable distribution assumptions and selected geographic neighborhood definitions. The question addressed is whether earlier results for small n extend to large and massively large n, especially for non-normal variables; the implications established are relevant to big spatial data. To achieve these comparisons, this paper summarizes proofs of limiting variances, also called asymptotic variances, for the efficiency analysis, and derives the relationship function between the two statistics in order to compare their statistical power at the same scale. Visualization of this statistical power analysis employs an alternative technique that already appears in the literature, furnishing additional understanding and clarity about these spatial autocorrelation statistics. Results include the following: the Moran coefficient is more efficient than the Geary ratio for most surface partitionings, because this index has a relatively smaller asymptotic as well as exact variance; and the superior power of the Moran coefficient vis-à-vis the Geary ratio for positive spatial autocorrelation depends upon the type of geographic configuration, with this power approaching one as sample sizes become increasingly large. Because spatial analysts usually calculate these two statistics for interval/ratio data, this paper also includes comments about the join count statistics used for nominal data.
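For readers unfamiliar with the two statistics being compared, the following is a minimal sketch of their textbook definitions, not code from the paper. It assumes a binary rook-adjacency weights matrix on a regular square grid, which is only one of many possible neighborhood definitions; the function names (`rook_weights`, `moran_coefficient`, `geary_ratio`) and the two test surfaces are hypothetical and chosen purely for illustration.

```python
import numpy as np


def rook_weights(rows, cols):
    """Binary rook-adjacency matrix for a rows x cols regular grid (assumed layout)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # neighbor to the east
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < rows:  # neighbor to the south
                W[i, i + cols] = W[i + cols, i] = 1.0
    return W


def moran_coefficient(x, W):
    """MC = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = x - mean(x)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (x.size / W.sum()) * (z @ W @ z) / (z @ z)


def geary_ratio(x, W):
    """GR = ((n - 1) / (2 S0)) * sum_ij w_ij (x_i - x_j)^2 / sum_i z_i^2."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    diff_sq = (x[:, None] - x[None, :]) ** 2  # all pairwise squared differences
    return ((x.size - 1) / (2.0 * W.sum())) * (W * diff_sq).sum() / (z @ z)


rows, cols = 10, 10
W = rook_weights(rows, cols)
r_idx, c_idx = np.indices((rows, cols))

# Smooth gradient surface: neighbors are similar (positive autocorrelation).
gradient = (r_idx + c_idx).ravel()
mc_gradient = moran_coefficient(gradient, W)
gr_gradient = geary_ratio(gradient, W)

# Checkerboard surface: neighbors always differ (negative autocorrelation).
checker = np.where((r_idx + c_idx) % 2 == 0, 1.0, -1.0).ravel()
mc_checker = moran_coefficient(checker, W)
gr_checker = geary_ratio(checker, W)

print("gradient:     MC =", mc_gradient, " GR =", gr_gradient)
print("checkerboard: MC =", mc_checker, " GR =", gr_checker)
```

On the gradient surface the Moran coefficient is positive and the Geary ratio falls below 1; on the checkerboard the Moran coefficient equals -1 and the Geary ratio equals 2(n - 1)/n. The dense n-by-n matrix is used only for readability; at the massive sample sizes the abstract discusses, a sparse representation would be needed.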

Updated: 2019-03-25