You Only Run Once: Spark Auto-tuning from a Single Run
IEEE Transactions on Network and Service Management (IF 4.7). Pub Date: 2020-12-01. DOI: 10.1109/tnsm.2020.3034824
David Buchaca Prats, Felipe Albuquerque Portella, Carlos H. A. Costa, Josep Lluis Berral

Tuning the configuration of Spark jobs is not a trivial task. State-of-the-art auto-tuning systems are based on iteratively running workloads with different configurations, exploring the relevant features during the optimization process to find good solutions. Many optimizers improve the time-to-solution using black-box optimization algorithms that do not take any information from the Spark workloads into account. In this article, we present a new method for tuning configurations that uses information from a single run of a Spark workload. To achieve good performance, we mine the SparkEventLog generated by the Spark engine. This log file contains a large amount of information about the executed application, which we use to enrich a performance model with low-level features of the workload to be optimized, including Spark Actions, Transformations, and Task metrics. This process yields application-specific workload information, with which our system can predict sensible Spark configurations for unseen jobs, provided it has been trained with reasonable coverage of Spark applications. Experiments show that the presented system produces good configurations, achieving up to 80% speedup over the default Spark configuration and up to a 12x speedup in time-to-solution compared with a standard Bayesian Optimization procedure.
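The abstract describes mining the SparkEventLog for low-level workload features such as Task metrics. As a rough illustration of what such feature extraction could look like, and not the paper's actual implementation, the Python sketch below parses a Spark event log (a JSON-lines file written by the Spark engine) and aggregates a few task-level metrics; the chosen metric names and the downstream modelling step are assumptions for illustration.

```python
import json
from collections import Counter

def summarize_event_log(path):
    """Aggregate per-task metrics from a Spark event log (one JSON event per line).

    Illustrative sketch only: it sums a few metrics from SparkListenerTaskEnd
    events that could serve as low-level workload features for a performance
    model. Field names follow the Spark event-log schema and may vary by version.
    """
    totals = Counter()
    num_tasks = 0
    with open(path, encoding="utf-8") as log:
        for line in log:
            if not line.strip():
                continue
            event = json.loads(line)
            if event.get("Event") != "SparkListenerTaskEnd":
                continue
            metrics = event.get("Task Metrics", {})
            num_tasks += 1
            totals["executor_run_time_ms"] += metrics.get("Executor Run Time", 0)
            totals["jvm_gc_time_ms"] += metrics.get("JVM GC Time", 0)
            totals["memory_bytes_spilled"] += metrics.get("Memory Bytes Spilled", 0)
            totals["disk_bytes_spilled"] += metrics.get("Disk Bytes Spilled", 0)
            shuffle = metrics.get("Shuffle Read Metrics", {})
            totals["shuffle_read_bytes"] += (shuffle.get("Remote Bytes Read", 0)
                                             + shuffle.get("Local Bytes Read", 0))
    return {"num_tasks": num_tasks, **totals}

# Hypothetical usage (path and downstream model are assumptions):
# features = summarize_event_log("/tmp/spark-events/app-20201201120000-0001")
# These features, concatenated with a candidate configuration (e.g. executor memory,
# number of cores), could feed a regression model that predicts job runtime.
```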

Last updated: 2020-12-01