A two-level storage strategy for map-reduce enabled computation of local map algebra
Earth Science Informatics (IF 2.8), Pub Date: 2020-02-29, DOI: 10.1007/s12145-020-00452-x
Jianbo Zhang , Simin Zhou , Tingnan Liang , Yongchang Li , Caikun Chen , Hao Xia

In the big data era, high-resolution raster-based geocomputation has been widely employed in geospatial studies. The algorithms used in local map algebra operations are data-intensive and require large amounts of memory and massive computing power. Simply employing a distributed computing framework such as Hadoop to serve such applications incurs storage and performance issues. In this paper, we present a two-level storage strategy designed specifically for MapReduce implementations of local map algebra algorithms under Hadoop. This approach implements efficient storage and manipulation of large raster data sets through three processes: (1) partitioning a raster file into square tile sets, (2) compressing and reorganizing these tile sets to prevent tile overlap across data divisions, and (3) improving MapReduce’s I/O interfaces for data exchange in the parallel computation of map algebra. Experiments with real-world datasets show that the proposed strategy can achieve high speedup and efficiency for raster-based spatial analysis applications. The results also show that the strategy scales satisfactorily as the number of data nodes in the cluster or the raster data volume increases.
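To make the tiling idea concrete, the following is a minimal single-process sketch in Python. It is not the authors' implementation. It assumes in-memory rasters stored as nested lists, and the function names (`partition`, `map_local_op`, `assemble`, `local_add`) and the tile size are hypothetical. It leaves out Hadoop, compression, and the reorganization of tiles across data divisions. It only illustrates why a local (cell-by-cell) operation can be applied to each square tile independently and the results reassembled afterwards.

```python
from typing import Callable, Dict, List, Tuple

Raster = List[List[float]]
TileKey = Tuple[int, int]  # (tile_row, tile_col); hypothetical key layout


def partition(raster: Raster, tile_size: int) -> Dict[TileKey, Raster]:
    """Split a raster into square tiles keyed by (tile_row, tile_col).

    Edge tiles are smaller when the raster size is not a multiple of tile_size.
    """
    rows, cols = len(raster), len(raster[0])
    tiles: Dict[TileKey, Raster] = {}
    for r0 in range(0, rows, tile_size):
        for c0 in range(0, cols, tile_size):
            tiles[(r0 // tile_size, c0 // tile_size)] = [
                row[c0:c0 + tile_size] for row in raster[r0:r0 + tile_size]
            ]
    return tiles


def map_local_op(
    key: TileKey,
    tile_a: Raster,
    tile_b: Raster,
    op: Callable[[float, float], float],
) -> Tuple[TileKey, Raster]:
    """Map step: apply a cell-by-cell operation to one pair of aligned tiles."""
    out = [[op(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(tile_a, tile_b)]
    return key, out


def assemble(
    tiles: Dict[TileKey, Raster], tile_size: int, rows: int, cols: int
) -> Raster:
    """Reduce step: write each output tile back at its offset in the raster."""
    result: Raster = [[0.0] * cols for _ in range(rows)]
    for (tr, tc), tile in tiles.items():
        for i, tile_row in enumerate(tile):
            r = tr * tile_size + i
            c = tc * tile_size
            result[r][c:c + len(tile_row)] = tile_row
    return result


def local_add(a: Raster, b: Raster, tile_size: int = 2) -> Raster:
    """Local map algebra 'a + b', computed tile by tile."""
    tiles_a = partition(a, tile_size)
    tiles_b = partition(b, tile_size)
    mapped = dict(
        map_local_op(k, tiles_a[k], tiles_b[k], lambda x, y: x + y)
        for k in tiles_a
    )
    return assemble(mapped, tile_size, len(a), len(a[0]))
```

Because each output cell depends only on the input cells at the same position, no tile needs data from its neighbours, so in a distributed setting each `map_local_op` call could run on a different node.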

Updated: 2020-02-29