NumaPerf: Predictive and Full NUMA Profiling
arXiv - CS - Performance. Pub Date: 2021-02-10, DOI: arxiv-2102.05204
Xin Zhao (University of Massachusetts Amherst), Jin Zhou (University of Massachusetts Amherst), Hui Guan (University of Massachusetts Amherst), Wei Wang (University of Texas at San Antonio), Xu Liu (North Carolina State University), Tongping Liu (University of Massachusetts Amherst)

Achieving optimal performance for parallel applications on NUMA architectures is extremely challenging, which necessitates the assistance of profiling tools. However, existing NUMA profiling tools share similar shortcomings in portability, effectiveness, and helpfulness. This paper proposes a novel profiling tool, NumaPerf, that overcomes these issues. NumaPerf aims to identify potential performance issues for any NUMA architecture, not only for the current hardware. To achieve this, NumaPerf focuses on memory sharing patterns between threads rather than on actual remote accesses. NumaPerf further detects potential thread migrations and load imbalance issues that could significantly affect performance but are omitted by existing profilers. NumaPerf also separates out cache coherence issues, which may require different fix strategies. Based on our extensive evaluation, NumaPerf is able to identify more performance issues than any existing tool, and fixing these issues leads to a speedup of up to 5.94x.
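
To make the idea of profiling sharing patterns instead of actual remote accesses concrete, here is a minimal sketch. It is not NumaPerf's implementation: the trace format, the function name `classify_sharing`, and the page and cache-line sizes are all assumptions chosen for illustration. A page touched by more than one thread is flagged as a potential remote-access candidate on any NUMA topology, and a cache line touched by more than one thread with at least one write is reported separately as a potential cache coherence issue.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes
LINE_SIZE = 64    # assumed cache-line size in bytes


def classify_sharing(trace):
    """Classify memory sharing from a hypothetical access trace.

    trace: iterable of (thread_id, address, is_write) tuples.
    Returns (shared_pages, coherence_lines) as sorted lists of
    page indices and cache-line indices.
    """
    page_threads = defaultdict(set)    # page index -> threads that touched it
    line_threads = defaultdict(set)    # line index -> threads that touched it
    line_written = defaultdict(bool)   # line index -> was it ever written?

    for tid, addr, is_write in trace:
        page_threads[addr // PAGE_SIZE].add(tid)
        line = addr // LINE_SIZE
        line_threads[line].add(tid)
        line_written[line] = line_written[line] or is_write

    # Pages used by several threads may incur remote accesses once those
    # threads run on different nodes, regardless of the current machine.
    shared_pages = sorted(p for p, t in page_threads.items() if len(t) > 1)

    # Lines shared by several threads with at least one writer are
    # candidates for cache coherence traffic, reported separately.
    coherence_lines = sorted(
        l for l, t in line_threads.items() if len(t) > 1 and line_written[l]
    )
    return shared_pages, coherence_lines
```

Because the classification depends only on which threads touch which addresses, the result is the same no matter where the threads happened to run during profiling, which is the property the abstract describes as predictive.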

Updated: 2021-02-11