ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases, arXiv - CS - Distributed, Parallel, and Cluster Computing



ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-03-18, DOI: arxiv-2003.08011
Guang Chao Wang, Kenny Gross, and Akshay Subramaniam

Deploying big-data Machine Learning (ML) services in a cloud environment presents a challenge to the cloud vendor: sizing the cloud container configuration for any given customer use case. OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale customer ML use cases of any size across the range of cloud CPU-GPU "Shapes" (configurations of CPUs and/or GPUs in cloud containers available to end customers). Moreover, the OracleLabs and NVIDIA authors have collaborated on an ML benchmark study that analyzes the compute cost and GPU acceleration of any ML prognostic algorithm and assesses the reduction of compute cost in a cloud container comprising conventional CPUs and NVIDIA GPUs.
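The abstract does not spell out how the nested-loop Monte Carlo simulation is organized, so the following is only a minimal sketch of the general idea, not the authors' framework. It assumes the sweep runs over two illustrative parameters (number of signals and number of observations), and it uses a hypothetical placeholder workload in place of a real ML algorithm. The function names `stand_in_workload` and `scope_compute_cost` are invented for this illustration.

```python
import random
import statistics
import time


def stand_in_workload(n_signals, n_observations, rng):
    """Hypothetical placeholder for an ML algorithm (an assumption, not the
    paper's workload): builds random data and computes per-signal means."""
    data = [[rng.random() for _ in range(n_signals)]
            for _ in range(n_observations)]
    return [statistics.fmean(column) for column in zip(*data)]


def scope_compute_cost(signal_counts, observation_counts, trials=3, seed=0):
    """Nested-loop Monte Carlo sweep: for every parameter combination, run
    several randomized trials and keep the median wall-clock time."""
    rng = random.Random(seed)
    results = {}
    for n_signals in signal_counts:                 # outer loop
        for n_observations in observation_counts:   # inner loop
            timings = []
            for _ in range(trials):                 # Monte Carlo repetitions
                start = time.perf_counter()
                stand_in_workload(n_signals, n_observations, rng)
                timings.append(time.perf_counter() - start)
            results[(n_signals, n_observations)] = statistics.median(timings)
    return results


if __name__ == "__main__":
    surface = scope_compute_cost([4, 8, 16], [100, 200, 400])
    for (n_signals, n_observations), seconds in sorted(surface.items()):
        print(f"signals={n_signals:3d} observations={n_observations:4d} "
              f"median_s={seconds:.6f}")
```

Running the same sweep on each candidate Shape would, under these assumptions, yield one compute-cost table per Shape, and those tables could then be compared to pick a container size for a given use case.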


Updated: 2020-03-19
