Serverless Model Serving for Data Science
arXiv - CS - Databases Pub Date : 2021-03-04 , DOI: arxiv-2103.02958
Yuncheng Wu, Tien Tuan Anh Dinh, Guoyu Hu, Meihui Zhang, Yeow Meng Chee, Beng Chin Ooi

Machine learning (ML) is an important part of modern data science applications. Data scientists today must manage the end-to-end ML life cycle, which includes both model training and model serving; the latter is essential, as it makes their work available to end users. Systems for model serving require high performance, low cost, and ease of management. Cloud providers already offer model serving options, including managed services and self-rented servers. Recently, serverless computing, whose advantages include high elasticity and a fine-grained cost model, has brought another possibility for model serving. In this paper, we study the viability of serverless as a mainstream model serving platform for data science applications. We conduct a comprehensive evaluation of the performance and cost of serverless against other model serving systems on two clouds: Amazon Web Services (AWS) and Google Cloud Platform (GCP). We find that serverless outperforms many cloud-based alternatives in both cost and performance. More interestingly, under some circumstances it can even outperform GPU-based systems in both average latency and cost. These results differ from the claim of previous work that serverless is not suitable for model serving, and are contrary to the conventional wisdom that GPU-based systems are better than CPU-based systems for ML workloads. Other findings include a large gap in cold start time between AWS and GCP serverless functions, and serverless' low sensitivity to changes in workloads or models. Our evaluation results indicate that serverless is a viable option for model serving. Finally, we present several practical recommendations for data scientists on how to use serverless for scalable and cost-effective model serving.
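To make the serverless model serving setup concrete, the following is a minimal sketch of a serving function in the style of an AWS Lambda handler. The handler signature follows the standard Lambda convention, but the model, payload shape, and field names are illustrative assumptions, not the paper's actual experimental setup. Note how model loading happens at module scope: that work runs once per container instance (the "cold start" the abstract discusses), while each request pays only for inference.

```python
# Illustrative serverless model-serving function (AWS Lambda style).
# The toy linear model and request/response fields are hypothetical.
import json

def _load_model():
    # Stand-in for deserializing real weights (e.g., from object storage
    # or the deployment package); here a toy linear model y = 2x + 1.
    return {"weight": 2.0, "bias": 1.0}

# Module-scope work runs once per container ("cold start"); its cost is
# amortized across all warm invocations served by that container.
MODEL = _load_model()

def handler(event, context=None):
    """Per-request entry point: only inference happens here."""
    x = float(json.loads(event["body"])["x"])
    y = MODEL["weight"] * x + MODEL["bias"]
    return {"statusCode": 200, "body": json.dumps({"prediction": y})}
```

Usage: `handler({"body": json.dumps({"x": 3.0})})` returns a response whose body decodes to `{"prediction": 7.0}`. This split between cold-start initialization and per-request inference is what the fine-grained serverless cost model meters.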

Updated: 2021-03-05