Lock–Unlock
ACM Transactions on Computer Systems (IF 1.5), Pub Date: 2019-03-14, DOI: 10.1145/3301501
Rachid Guerraoui 1 , Hugo Guiroux 2 , Renaud Lachaize 2 , Vivien Quéma 2 , Vasileios Trigonakis 3

A plethora of optimized mutex lock algorithms have been designed over the past 25 years to mitigate performance bottlenecks related to critical sections and locks. Unfortunately, there is currently no broad study of the behavior of these optimized lock algorithms on realistic applications that consider different performance metrics, such as energy efficiency and tail latency. In this article, we perform a thorough and practical analysis of synchronization, with the goal of providing software developers with enough information to design fast, scalable, and energy-efficient synchronization in their systems. First, we perform a performance study of 28 state-of-the-art mutex lock algorithms, on 40 applications, on four different multicore machines. We consider not only throughput (traditionally the main performance metric) but also energy efficiency and tail latency, which are becoming increasingly important. Second, we present an in-depth analysis in which we summarize our findings for all the studied applications. In particular, we describe nine different lock-related performance bottlenecks, and we propose six guidelines helping software developers with their choice of a lock algorithm according to the different lock properties and the application characteristics. 
From our detailed analysis, we make several observations regarding locking algorithms and application behaviors, several of which have not been previously discovered: (i) applications stress not only the lock–unlock interface but also the full locking API (e.g., trylocks, condition variables); (ii) the memory footprint of a lock can directly affect the application performance; (iii) for many applications, the interaction between locks and scheduling is an important application performance factor; (iv) lock tail latencies may or may not affect application tail latency; (v) no single lock is systematically the best; (vi) choosing the best lock is difficult; and (vii) energy efficiency and throughput go hand in hand in the context of lock algorithms. These findings highlight that locking involves more considerations than the simple lock/unlock interface and call for further research on designing low-memory footprint adaptive locks that fully and efficiently support the full lock interface, and consider all performance metrics.
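
Observation (i) may be easier to picture with a small sketch. The Python example below is not taken from the article; it is an illustrative assumption showing how even a simple bounded queue relies on three parts of a lock interface at once: blocking lock/unlock, a non-blocking trylock, and condition variables. The names `BoundedQueue`, `try_put`, and `run_demo` are hypothetical, and Python's `threading` module stands in for the native mutex libraries the article studies.

```python
import threading
from collections import deque


class BoundedQueue:
    """Hypothetical bounded FIFO queue that uses the 'full' locking API."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        # Both condition variables share the same underlying mutex.
        self._not_empty = threading.Condition(self._lock)
        self._not_full = threading.Condition(self._lock)

    def put(self, item):
        # Blocking lock/unlock plus a condition-variable wait.
        with self._lock:
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def try_put(self, item):
        # Trylock: give up immediately if the mutex is held elsewhere.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            if len(self._items) >= self._capacity:
                return False
            self._items.append(item)
            self._not_empty.notify()
            return True
        finally:
            self._lock.release()

    def get(self):
        with self._lock:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(n_items=100):
    """One producer, one consumer; the producer prefers the trylock path."""
    queue = BoundedQueue(capacity=4)
    received = []

    def consumer():
        for _ in range(n_items):
            received.append(queue.get())

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n_items):
        # Fall back to the blocking path when the trylock path fails.
        if not queue.try_put(i):
            queue.put(i)
    worker.join()
    return received
```

In this sketch, a lock implementation that only optimizes the blocking acquire path would leave `try_put` and the two condition variables untouched, which is the kind of gap the observation points at.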

Updated: 2019-03-14