Multi-scale process modelling and distributed computation for spatial data
Statistics and Computing (IF 2.2), Pub Date: 2020-07-16, DOI: 10.1007/s11222-020-09962-6
Andrew Zammit-Mangion , Jonathan Rougier

Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced cost of distributed-processing technology. It is well known that modelling and prediction using infinite-dimensional process models is not possible with large data sets, and that both approximate models and, often, approximate-inference methods, are needed. The problem of fitting simple global spatial models to large data sets has been solved through the likes of multi-resolution approximations and nearest-neighbour techniques. Here we tackle the next challenge, that of fitting complex, nonstationary, multi-scale models to large data sets. We propose doing this through the use of superpositions of spatial processes with increasing spatial scale and increasing degrees of nonstationarity. Computation is facilitated through the use of Gaussian Markov random fields and parallel Markov chain Monte Carlo based on graph colouring. The resulting model allows for both distributed computing and distributed data. Importantly, it provides opportunities for genuine model and data scalability and yet is still able to borrow strength across large spatial scales. We illustrate a two-scale version on a data set of sea-surface temperature containing on the order of one million observations, and compare our approach to state-of-the-art spatial modelling and prediction methods.
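
The graph-colouring idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a first-order Gaussian Markov random field on a small regular lattice, a dense precision matrix, and a two-colour checkerboard colouring, and the names `lattice_precision`, `checkerboard_colours`, `chromatic_gibbs` and the parameter `kappa2` are hypothetical. The point it shows is that nodes sharing a colour have no edges between them, so they are conditionally independent given the rest and can be updated in one simultaneous (and hence parallelisable) step.

```python
import numpy as np


def lattice_precision(n, kappa2=0.5):
    """Dense precision matrix of a first-order GMRF on an n x n lattice.

    Illustrative assumption: diagonal 4 + kappa2, -1 for each of the
    (up to four) lattice neighbours, which makes Q symmetric and
    strictly diagonally dominant, hence positive definite.
    """
    N = n * n
    Q = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            Q[i, i] = 4.0 + kappa2
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    Q[i, rr * n + cc] = -1.0
    return Q


def checkerboard_colours(n):
    """Two-colouring of the lattice: neighbours always differ in (row + col) parity."""
    r, c = np.divmod(np.arange(n * n), n)
    return (r + c) % 2


def chromatic_gibbs(Q, colours, n_iter, rng):
    """Gibbs sampler for x ~ N(0, Q^{-1}) that updates one colour class at a time."""
    N = Q.shape[0]
    d = np.diag(Q)
    x = np.zeros(N)
    samples = np.empty((n_iter, N))
    for t in range(n_iter):
        for k in np.unique(colours):
            idx = np.flatnonzero(colours == k)
            # Sum over j != i of Q_ij x_j; same-colour entries of Q are zero,
            # so only nodes of the other colour contribute.
            off_diag = Q[idx] @ x - d[idx] * x[idx]
            cond_mean = -off_diag / d[idx]
            cond_sd = 1.0 / np.sqrt(d[idx])
            # All nodes of this colour are drawn at once; this is the step
            # that could be spread across workers.
            x[idx] = cond_mean + cond_sd * rng.standard_normal(idx.size)
        samples[t] = x
    return samples


if __name__ == "__main__":
    n = 8
    Q = lattice_precision(n)
    colours = checkerboard_colours(n)
    draws = chromatic_gibbs(Q, colours, n_iter=200, rng=np.random.default_rng(1))
    print(draws.shape)
```

In a realistic setting the precision matrix would be sparse and the colouring would be applied to whatever graph structure the model induces; the checkerboard is only the simplest case in which a valid colouring is obvious.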




Updated: 2020-07-16