Tutorial on systems with antifragility to downtime
Computing (IF 3.3), Pub Date: 2021-01-07, DOI: 10.1007/s00607-020-00895-6
Kjell Jørgen Hole

An antifragile system of software and stakeholders, including designers, developers, and operators, learns from incidents how to avoid outages and maintain high uptime. This tutorial article reviews how to design and operate such socio-technical systems with antifragility to downtime. It documents the importance of four design principles and two operational principles by exploring the polar opposite anti-principles and the interplay between the principles and the anti-principles. The design principles mandate a software design of separate and isolatable processes with sufficient diversity and redundancy. The processes should communicate asynchronously over an external network. The operational principles imply that the software development teams should repeatedly inject artificial failures into the production system to understand its behavior and to detect and mitigate vulnerabilities as the system and its environment change.
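
As a rough illustration of how redundancy, isolation, and failure injection fit together, the following minimal Python sketch may help. It is not taken from the article: the `Replica` class, `call_with_failover`, `inject_failure`, and `run_experiment` are hypothetical names, an in-process `asyncio` call stands in for asynchronous communication over a network, and the replicas are identical, so the diversity principle is not modeled here.

```python
import asyncio


class Replica:
    """Hypothetical isolated service replica (an assumption for illustration)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True  # toggled by fault injection

    async def handle(self, request):
        # Yield control once, standing in for an asynchronous network hop.
        await asyncio.sleep(0)
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


async def call_with_failover(replicas, request, timeout=0.1):
    """Try replicas in order; a failed or slow replica is skipped."""
    for replica in replicas:
        try:
            return await asyncio.wait_for(replica.handle(request), timeout)
        except (ConnectionError, asyncio.TimeoutError):
            continue  # the failure stays contained in that replica
    raise RuntimeError("all replicas failed")


def inject_failure(replica):
    """Artificially take one replica down, as a failure-injection experiment would."""
    replica.healthy = False


def run_experiment():
    replicas = [Replica("a"), Replica("b"), Replica("c")]
    inject_failure(replicas[0])  # deliberately break the first replica
    return asyncio.run(call_with_failover(replicas, "req-1"))


if __name__ == "__main__":
    print(run_experiment())
```

In this sketch the injected failure of replica `a` is absorbed because the caller fails over to replica `b`. If every replica were taken down, `call_with_failover` would raise `RuntimeError`, which is the kind of vulnerability a failure-injection experiment is meant to surface before a real incident does.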

Updated: 2021-01-07