Machine Learning for Performance Prediction of Spark Cloud Applications

arXiv - CS - Performance. Pub Date: 2021-08-27, DOI: arxiv-2108.12214
Alexandre Maros, Fabricio Murai, Ana Paula Couto da Silva, Jussara M. Almeida, Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna

Big data applications and analytics are employed in many sectors for a variety of goals: improving customer satisfaction, predicting market behavior, or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML) provides black-box solutions that model the relationship between application performance and system configuration without requiring detailed knowledge of the system, and it has become a popular way of predicting the performance of big data applications. We investigate the costs and benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with Ernest (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating to larger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%.
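To make the percentage figures above concrete, here is a minimal sketch of how a relative prediction error could be computed. It assumes the error is a mean absolute percentage error (MAPE); the abstract does not name the metric, so this choice is an assumption. The function name `mape` and the execution times are hypothetical and illustrative only, not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: the abstract's "prediction error" is a relative error of
    this kind; the abstract itself does not name the metric.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical execution times in seconds (illustrative, not from the paper).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"MAPE: {mape(measured, predicted):.1f}%")
```

Under this reading, an error above 100% means the predictions are off, on average, by more than the measured execution time itself, whereas an error of 5-19% is a small fraction of the run time.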

Updated: 2021-08-30
