Towards Observability for Machine Learning Pipelines
arXiv - CS - Databases Pub Date : 2021-08-31 , DOI: arxiv-2108.13557
Shreya Shankar, Aditya Parameswaran

Software organizations are increasingly incorporating machine learning (ML) into their product offerings, driving a need for new data management tools. Many of these tools facilitate the initial development and deployment of ML applications, contributing to a crowded landscape of disconnected solutions targeted at different stages, or components, of the ML lifecycle. A lack of end-to-end ML pipeline visibility makes it hard to address any issues that may arise after a production deployment, such as unexpected output values or lower-quality predictions. In this paper, we propose a system that wraps around existing tools in the ML development stack and offers end-to-end observability. We introduce our prototype and our vision for mltrace, a platform-agnostic system that provides observability to ML practitioners by (1) executing predefined tests and monitoring ML-specific metrics at component runtime, (2) tracking end-to-end data flow, and (3) allowing users to ask arbitrary post-hoc questions about pipeline health.
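The three capabilities listed in the abstract can be illustrated with a minimal sketch. This is not mltrace's actual API: the names `component`, `trace`, and `RUN_LOG` are hypothetical, data items are represented as plain string identifiers, and an in-memory list stands in for whatever storage a real system would use.

```python
import functools
import time

# Hypothetical in-memory log; a real system would persist runs somewhere durable.
RUN_LOG = []


def component(name, tests=()):
    """Wrap a pipeline step so each call is tested and logged (illustrative only)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(inputs):
            start = time.time()
            outputs = fn(inputs)
            # (1) Run predefined tests on the outputs at component runtime.
            failures = [t.__name__ for t in tests if not t(outputs)]
            # (2) Record which inputs produced which outputs.
            RUN_LOG.append({
                "component": name,
                "inputs": list(inputs),
                "outputs": list(outputs),
                "failed_tests": failures,
                "seconds": time.time() - start,
            })
            return outputs
        return wrapper
    return decorator


def trace(output_id):
    """(3) Post-hoc query: walk the log backwards to find the runs behind an output."""
    lineage, frontier = [], {output_id}
    for run in reversed(RUN_LOG):
        if frontier & set(run["outputs"]):
            lineage.append(run["component"])
            frontier |= set(run["inputs"])
    return lineage


def non_empty(outputs):
    return len(outputs) > 0


@component("clean", tests=[non_empty])
def clean(inputs):
    return [f"clean:{x}" for x in inputs]


@component("featurize", tests=[non_empty])
def featurize(inputs):
    return [f"feat:{x}" for x in inputs]


featurize(clean(["raw.csv"]))
print(trace("feat:clean:raw.csv"))  # components listed from most recent to earliest
```

The decorator keeps the pipeline code unchanged, which mirrors the abstract's idea of wrapping around existing tools rather than replacing them.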

Updated: 2021-09-01