Peripheral Diagnosis for Propagated Network Faults

Journal of Network and Systems Management

Abstract

Failures are unavoidable in communication networks, so detecting and identifying them is vital for reliable network operation. Existing fault diagnosis techniques draw on paradigms from different areas (e.g., mathematical theories, machine learning, statistical analysis) and serve different purposes, such as obtaining a representation model of the network for fault localization, selecting optimal probe sets for monitoring network devices, reducing fault detection time, and detecting faulty components in the network. Nevertheless, challenges remain because those techniques are invasive: they increase network traffic and control overhead. They also intensify the internal processes of the network by expanding management processes or monitoring agents on almost all networking devices. This paper introduces a non-invasive fault detection approach based on observing the symptoms of internal network failures at gateway routers (called peripheral elements). We developed a link failure induction experiment in an emulated network, which provided evidence that faults propagate to the peripheral level and thus demonstrates the feasibility of our approach. Our results encourage the use of learning techniques that do not require a complete dependency model of the network and that could continuously diagnose failure symptoms while remaining resilient to dynamic changes in the network.

Acknowledgements

This work was developed with the support of the Telematics Engineering Group (GIT) of the University of Cauca and the Systems Control, Learning and Optimization group (CAOS) of the Carlos III University of Madrid. The authors are grateful to the following Colombian institutions for funding the Ph.D. project in which this work was developed: the Administrative Department of Science, Technology, and Innovation (COLCIENCIAS; call for national doctorates No. 647-2014) and the Vice-Rectorate for Research of the University of Cauca (Project ID 4660). This work has also been supported by the Spanish Government under Projects TRA2016-78886-C3-1-R and PID2019-104793RB-C31.

Author information

Corresponding author

Correspondence to Angela M. Vargas-Arcila.

About this article

Cite this article

Vargas-Arcila, A.M., Corrales, J.C., Sanchis, A. et al. Peripheral Diagnosis for Propagated Network Faults. J Netw Syst Manage 29, 14 (2021). https://doi.org/10.1007/s10922-020-09579-0

  • DOI: https://doi.org/10.1007/s10922-020-09579-0
