Abstract
Self-healing, a prominent property of self-adaptiveness, provides reliability, availability, maintainability, and survivability to a software system. These qualitative factors are particularly important to modern distributed systems, in which components and their collaborations often vary. Survivability of such systems is best addressed from an architectural viewpoint. For maintainability and reliability, however, architecture-level adaptation is rarely supported during the design phase. Incorporating fault-tolerance adaptation into the design phase of the system development process can increase software availability and thereby help attain self-healing. Most existing architectures for distributed systems treat communication and correspondence as their primary criteria. A multi-agent mechanism, on the other hand, supports schematic control of functionality and communication while emphasizing scalability. This paper proposes a novel architecture that supports agent-based distributed systems in addressing fault recovery to achieve self-adaptiveness. Unlike traditional multi-agent architectures, it incorporates task-oriented functional multi-agent communication for the various design-phase activities designated to perform self-healing. An adaptation of the agent communication control flow is proposed using three novel mechanisms (planning, functioning, and enacting) as the agents’ critical responsibilities. The paper also validates the proposed architecture against resource- and availability-based faults related to crash and resource unavailability, using performance-based evaluation metrics. A case-based application with single-thread connectivity is used to reflect the architecture during the application design phase and is tested for success using mean response time as the evaluation metric.
Acknowledgements
The authors would like to thank Dr. Krian Kumar Ravulakollu, Senior Member of the International Neural Network Society, for his direction and suggestions as an advisor on the experimental strategy and validation.
About this article
Cite this article
Rajput, P.K., Sikka, G. Multi-agent architecture for fault recovery in self-healing systems. J Ambient Intell Human Comput 12, 2849–2866 (2021). https://doi.org/10.1007/s12652-020-02443-8
DOI: https://doi.org/10.1007/s12652-020-02443-8