Scalene: Scripting-Language Aware Profiling for Python, arXiv - CS - Programming Languages


Scalene: Scripting-Language Aware Profiling for Python
arXiv - CS - Programming Languages. Pub Date: 2020-06-06, DOI: arxiv-2006.03879
Emery D. Berger

Existing profilers for scripting languages (a.k.a. "glue" languages) like Python suffer from numerous problems that drastically limit their usefulness. They impose order-of-magnitude overheads, report information at too coarse a granularity, or fail in the face of threads. Worse, past profilers, essentially variants of their counterparts for C, are oblivious to the fact that optimizing code in scripting languages requires information about code spanning the divide between the scripting language and libraries written in compiled languages. This paper introduces scripting-language aware profiling, and presents Scalene, an implementation of scripting-language aware profiling for Python. Scalene employs a combination of sampling, inference, and disassembly of byte-codes to efficiently and precisely attribute execution time and memory usage to either Python, which developers can optimize, or library code, which they cannot. It includes a novel sampling memory allocator that reports line-level memory consumption and trends with low overhead, helping developers reduce footprints and identify leaks. Finally, it introduces a new metric, copy volume, to help developers root out insidious copying costs across the Python/library boundary, which can drastically degrade performance. Scalene works for single- or multi-threaded Python code and is precise, reporting detailed information at line granularity while imposing modest overheads (26% to 53%).
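To make the idea of sampling-based, line-level attribution concrete, here is a minimal sketch in Python. It is not Scalene's implementation, and it covers only the sampling part (no inference, byte-code disassembly, or memory tracking). The names `LineSampler`, `busy_work`, and `profile_busy_work` are hypothetical, and the approach (a background thread that periodically inspects another thread's current frame through CPython's `sys._current_frames`) is an assumption chosen for illustration.

```python
import collections
import sys
import threading
import time


class LineSampler:
    """Toy sampling profiler (hypothetical, not Scalene's design).

    A background thread wakes up periodically and records which
    (filename, line number) the target thread is currently executing.
    """

    def __init__(self, target_thread_id, interval=0.001):
        self.target_thread_id = target_thread_id
        self.interval = interval  # requested seconds between samples
        self.samples = collections.Counter()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            # CPython-specific: maps thread id -> that thread's current frame.
            frame = sys._current_frames().get(self.target_thread_id)
            if frame is not None:
                key = (frame.f_code.co_filename, frame.f_lineno)
                self.samples[key] += 1
            time.sleep(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()
        return False


def busy_work(duration=0.3):
    """Pure-Python loop that spins for roughly `duration` seconds."""
    deadline = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < deadline:
        total += sum(i * i for i in range(200))
    return total


def profile_busy_work():
    """Sample the calling thread while it runs busy_work()."""
    with LineSampler(threading.get_ident()) as sampler:
        busy_work()
    return sampler.samples
```

Calling `profile_busy_work().most_common(5)` lists the most frequently sampled lines. Because the sampler only observes Python frames, time spent inside a long-running native library call is charged to the Python line that made the call; separating the two is the gap the abstract says scripting-language aware profiling is meant to close.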


Last updated: 2020-07-28
