Collaborative Accelerators for Streamlining MapReduce on Scale-up Machines with Incremental Data Aggregation, IEEE Transactions on Computers


IEEE Transactions on Computers (IF 3.6), Pub Date: 2020-08-01, DOI: 10.1109/tc.2020.3004169
Abraham Addisie, Valeria Bertacco

The MapReduce programming paradigm has been increasingly adopted to implement data-intensive applications that process both small- and large-scale datasets. Because most jobs in data centers have a data footprint on the order of gigabytes, emerging high-end scale-up machines are capable of running most data center processing tasks, thus significantly improving power and server density. However, this approach provides limited performance and energy efficiency because of inefficient utilization of the memory subsystem and serial execution within the MapReduce programming model. Recent work has proposed a distributed hardware acceleration architecture, called CASM, which augments each core in a scale-up machine with a lightweight compute engine. CASM's network of accelerators operates concurrently with the cores in executing MapReduce stages and significantly reduces traffic to and from storage. In this article, we study the benefits and applicability of CASM by offering an extensive analysis of its design parameters and of its scalable performance on a wide range of applications, and by exploring its applicability to incremental data aggregation tasks. Our experimental evaluation indicates that CASM reduces off-chip traffic by a factor of four on average relative to a chip multiprocessor solution while scaling well with the number of cores in the system, and that it is highly effective in providing incremental results that approximate final outcomes.
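To make the idea of incremental data aggregation concrete, the following is a minimal software sketch in Python. It is not CASM or the authors' method: the function names (`map_chunk`, `incremental_reduce`), the word-count workload, and the sample input are all assumptions chosen for illustration. It shows only the general pattern of folding per-chunk partial results into a running total so that intermediate snapshots are available before all input has been processed.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def map_chunk(lines: Iterable[str]) -> Counter:
    """Map plus local combine: count words within one chunk of input."""
    local = Counter()
    for line in lines:
        local.update(line.lower().split())
    return local


def incremental_reduce(chunks: Iterable[List[str]]) -> Iterator[Counter]:
    """Fold each chunk's partial counts into a running total.

    Yields a snapshot after every chunk, so a caller can inspect
    intermediate results before the whole input has been consumed.
    """
    running = Counter()
    for chunk in chunks:
        running.update(map_chunk(chunk))  # Counter.update adds counts per key
        yield Counter(running)            # copy, so snapshots stay independent


# Hypothetical input, split into three chunks.
chunks = [
    ["the quick brown fox", "the lazy dog"],
    ["the fox jumps"],
    ["dog and fox"],
]
snapshots = list(incremental_reduce(chunks))
final = snapshots[-1]
```

Each snapshot is a partial aggregate over the chunks seen so far, and the last one equals the result a batch word count over the full input would produce. How closely earlier snapshots approximate the final outcome depends on the workload and on how the data is distributed across chunks.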


Updated: 2020-08-01
