RDMA-Based Apache Storm for High-Performance Stream Data Processing
International Journal of Parallel Programming ( IF 1.5 ) Pub Date : 2021-03-18 , DOI: 10.1007/s10766-021-00696-0
Ziyu Zhang , Zitan Liu , Qingcai Jiang , Junshi Chen , Hong An

Apache Storm is a scalable, fault-tolerant, distributed real-time stream-processing framework widely used in big data applications. For distributed data-sensitive applications, low-latency, high-throughput communication modules have a critical impact on overall system performance. Apache Storm currently uses Netty, an asynchronous server/client framework built on the TCP/IP protocol stack, as its communication component. The TCP/IP protocol stack has inherent performance flaws due to frequent memory copying and context switching. The Netty component not only limits Storm's performance but also increases the CPU load in the IPoIB (IP over InfiniBand) communication mode. In this paper, we introduce two new implementations of the Apache Storm communication component based on RDMA technology. A performance evaluation on Mellanox QDR cards (40 Gbps) shows that our implementations achieve speedups of up to 5\(\times\) over IPoIB and up to 10\(\times\) over Gigabit Ethernet. Our implementations also significantly reduce the CPU load and increase the throughput of the system.




Updated: 2021-03-19