Resilient Virtualized Systems Using ReHype
arXiv - CS - Operating Systems Pub Date : 2021-01-23 , DOI: arxiv-2101.09282
Michael Le, Yuval Tamir

System-level virtualization introduces critical vulnerabilities to failures of the software components that implement virtualization -- the virtualization infrastructure (VI). To mitigate the impact of such failures, we introduce a resilient VI (RVI) that can recover individual VI components from failures caused by hardware or software faults, transparently to the hosted virtual machines (VMs). Much of the focus is on the ReHype mechanism for recovery from hypervisor failures, which can lead to state corruption and to inconsistencies among the states of system components. ReHype's implementation for the Xen hypervisor was done incrementally, using fault injection results to identify sources of critical corruption and inconsistencies. This implementation involved 900 LOC, with a memory space overhead of 2.1MB. Fault injection campaigns, with a variety of fault types, show that ReHype can successfully recover, in less than 750ms, from over 88% of detected hypervisor failures. In addition to ReHype, recovery mechanisms for the other VI components are described. The overall effectiveness of our RVI is evaluated while hosting a Web service application on a cluster of VMs. With faults in any VI component, for over 87% of detected failures, our recovery mechanisms allow the services provided by the application to be continuously maintained despite the resulting failures of VI components.

Updated: 2021-01-26