Containment Domains: A Scalable, Efficient and Flexible Resilience Scheme for Exascale Systems
Scientific Programming, Pub Date: 2013, DOI: 10.3233/spr-130374
Jinsuk Chung, Ikhwan Lee, Michael Sullivan, Jee Ho Ryoo, Dong Wan Kim, Doe Hyun Yoon, Larry Kaplan, Mattan Erez

This paper describes and evaluates a scalable and efficient resilience scheme based on the concept of containment domains. A containment domain is a programming construct that enables an application to express its resilience needs and to interact with the system to tune and specialize error detection, state preservation and restoration, and recovery schemes. Containment domains have weak transactional semantics and are nested, both to take advantage of the machine and application hierarchies and to enable hierarchical state preservation, restoration and recovery. We evaluate the scalability and efficiency of containment domains using generalized trace-driven simulation and analytical modeling, and show that containment domains are superior to both checkpoint-restart and redundant execution approaches.
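
The abstract does not show the programming interface, so the following is only a minimal Python sketch of the preserve, detect, restore and re-execute pattern it describes, including nesting. The names `run_domain`, `DetectedError` and `demo`, the dictionary used as state, and the injected fault sequence are all hypothetical and are not taken from the paper. The sketch assumes that a domain copies its state on entry, re-executes its body after a detected error, and escalates to its parent once its own retries are exhausted.

```python
import copy


class DetectedError(Exception):
    """Hypothetical signal that an error was detected inside a domain."""


def run_domain(state, body, detect, max_retries=1):
    """Run `body` on `state` as one containment-domain-like region.

    Hypothetical helper, not the paper's API. `state` is a dict that is
    mutated in place; `detect(state)` returns True when an error is found.
    """
    preserved = copy.deepcopy(state)            # preserve state on entry
    for _ in range(max_retries + 1):
        try:
            body(state)                         # body may open nested domains
            if detect(state):
                raise DetectedError("detection check failed")
            return state                        # no error: leave the domain
        except DetectedError:
            state.clear()                       # restore the preserved state
            state.update(copy.deepcopy(preserved))
    # Local recovery failed: escalate so the parent domain can recover.
    raise DetectedError("retries exhausted")


def demo():
    # Injected faults (assumption): the inner body is corrupted on its
    # first two executions and runs cleanly on the third.
    faults = iter([True, True, False])
    log = []

    def inner_body(s):
        s["x"] += 1
        if next(faults, False):
            s["x"] = -999                       # simulated corruption

    def inner_detect(s):
        return s["x"] < 0

    def outer_body(s):
        log.append("outer")
        s["x"] = 10
        run_domain(s, inner_body, inner_detect, max_retries=1)

    state = {"x": 0}
    run_domain(state, outer_body, lambda s: False, max_retries=1)
    return state, log
```

In `demo`, the inner domain fails twice and exhausts its single retry, so the error escalates to the outer domain, which restores its own preserved state and re-executes. A real system would choose where and how to preserve state per level of the hierarchy, which this sketch does not model.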

Updated: 2020-09-25