Rollback Mechanisms for Cloud Management APIs using AI planning, IEEE Transactions on Dependable and Secure Computing


IEEE Transactions on Dependable and Secure Computing (IF 7.3), Pub Date: 2020-01-01, DOI: 10.1109/tdsc.2017.2729543
Suhrid Satyal, Ingo Weber, Len Bass, Min Fu

Human-induced faults play a large role in system reliability. In cloud platforms, system administrators may inadvertently make catastrophic mistakes, such as deleting a virtual disk that holds important data. Providing rollback for cloud operations can reduce the severity and impact of such mistakes by allowing administrators to revert to a known good state. However, in the context of cloud management this is non-trivial, since cloud consumers have only limited visibility and indirect control. In this paper, we present a scalable approach to rolling back operations that change the state of a system on proprietary cloud platforms. In our previous work, we provided a system that augments cloud APIs and provides a rollback operation using an AI planner. Here we build upon that work, but parallelize rollback plan generation based on characteristics unique to the rollback scenario. Furthermore, we introduce a distributed anytime algorithm that gradually improves plan quality over time, until either an optimal plan is found or a timeout is reached. Through experimental evaluation we show that our approach scales better than a naïve approach and effectively avoids the exponential behavior of AI planning. Further, we explore the trade-offs between the quality of rollback plans and plan generation time.


Updated: 2020-01-01
