Towards Concurrent Stateful Stream Processing on Multicore Processors (Technical Report)
arXiv - CS - Databases, Pub Date: 2019-04-08, DOI: arxiv-1904.03800
Shuhao Zhang, Yingjun Wu, Feng Zhang, and Bingsheng He

Recent data stream processing systems (DSPSs) can achieve excellent performance when processing large volumes of data under tight latency constraints. However, they sacrifice support for concurrent state access, a feature that eases the burden of developing stateful stream applications. Recently, some have proposed managing concurrent state access during stream processing by modeling state accesses as transactions. However, these proposals are realized with locks, which incur serious contention overhead. Their coarse-grained processing paradigm further magnifies contention issues and tends to utilize modern multicore architectures poorly. This paper introduces TStream, a novel DSPS supporting efficient concurrent state access on multicore processors. Like previous work, TStream employs transactional semantics, but it greatly improves scalability through two novel designs: 1) dual-mode scheduling, which exposes more parallelism opportunities, and 2) dynamic restructuring execution, which aggressively exploits the parallelism opportunities from dual-mode scheduling without centralized lock contention. To validate our proposal, we evaluate TStream with a benchmark of four applications on a modern multicore machine. The experimental results show that 1) TStream achieves up to 4.8 times higher throughput with similar processing latency compared to the state-of-the-art and 2) unlike prior solutions, TStream is highly tolerant of varying application workloads such as key skewness and multi-partition state accesses.

Updated: 2020-01-17