Alarm classification prediction based on cross-layer artificial intelligence interaction in self-optimized optical networks (SOON)
Introduction
Emerging 5G technology and a growing number of content-based applications bring unique connectivity challenges in terms of required bandwidth and resource utilization. Optical networks, which support the bulk of this transformation, continue to grow in architectural complexity and heterogeneity. They require high availability and reliable operation to serve end-to-end services continuously. Traditional protection schemes require substantial investment at multiple layers and leave available network resources under-utilized. Even with protective measures, network failures still occur, typically caused by the underlying network equipment and optical transmission lines [1]. Artificial intelligence (AI) algorithms (specifically, supervised machine learning (ML) in this study) can learn system behavior from past data and estimate future responses based on the learned system model. These ML techniques can serve as important support tools for alarm prediction in optical networks.
Ref. [2] proposes a method of alarm pre-processing and correlation analysis for optical transport networks; its results show that the method is promising for trivial alarm identification, chain alarm mining, and root failure localization in existing optical networks. The main topic of Ref. [3] is failure prediction from large amounts of alarm records stored in different databases of non-cooperating network management systems. The main problems addressed in that paper are the evaluation of alarms, virtual reconstruction of the network, and the development of tools to overcome interoperability issues. The motivation behind that work is to assist human operators and minimize the cost of the alarm evaluation process. Ref. [4] uses ML as an instrument to address network assurance via dynamic data-driven operation. A cognitive failure detection architecture for intelligent network assurance in a software-defined network (SDN) controller is proposed and demonstrated on real-world failure examples. The framework detects and identifies significant failures and outperforms conventional fixed threshold-triggered operations in terms of both detection precision and proactive reaction time.
Ref. [5] proposes a performance monitoring and failure prediction method for optical networks. The primary algorithms of this method are the support vector machine (SVM) and double exponential smoothing (DES). The proposed protection plan primarily investigates how to predict the risk of an equipment failure. Experimental results show that the average prediction precision of the method is 95% when predicting the optical equipment failure state. Ref. [6] analyzes 14 months of network alarm logs from a metropolitan area network; its preliminary results show that most network failures can be predicted by analyzing previous network running logs. This method can detect failures at an early stage in practical applications and reduce unnecessary economic losses.
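To make the forecasting component mentioned above more concrete, the following is a minimal sketch of double exponential smoothing in its widely used Holt (level plus trend) form. It is an illustration only: the variant, the smoothing factors `alpha` and `beta`, and the sample readings are assumptions and are not taken from Ref. [5].

```python
from typing import List, Sequence


def double_exponential_smoothing(
    series: Sequence[float], alpha: float, beta: float, horizon: int = 1
) -> List[float]:
    """Forecast `horizon` future points with Holt's linear (double) smoothing.

    alpha smooths the level, beta smooths the trend; both are assumed to lie
    in (0, 1]. This is a generic textbook formulation, not the exact
    procedure of any cited work.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations to initialise the trend")

    # Initialise the level with the first value and the trend with the first difference.
    level = series[0]
    trend = series[1] - series[0]

    for value in series[1:]:
        previous_level = level
        # New level: blend the observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (level + trend)
        # New trend: blend the latest level change with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend

    # Extrapolate the final level along the final trend.
    return [level + step * trend for step in range(1, horizon + 1)]


# Hypothetical, perfectly linear monitoring readings (arbitrary units).
readings = [2.0, 4.0, 6.0, 8.0, 10.0]
forecast = double_exponential_smoothing(readings, alpha=0.5, beta=0.5, horizon=2)
print(forecast)
```

In a failure-prediction pipeline of the kind described above, the forecast values of each monitored parameter would then be passed to a classifier such as an SVM to estimate the future equipment state.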
At present, optical networks are mainly under centralized control. If all functions, e.g., data processing, model training, and policy scheduling, are performed in the control plane, the controller will bear a heavy burden [7]. It is therefore feasible to introduce AI engines into devices in the data plane to relieve the pressure on the controller. These AI engines can be deployed as embedded modules. Our previous study [8] proposed coordination between AI engines in the control plane and device AI embedded on board. Device AI, as the name suggests, is an AI engine embedded on a device board, which can be deployed in a cabinet with other communications devices. Device AI is based on edge computing and supports various AI applications. The distributed AI engines in devices carry out data processing in the data plane, which can greatly reduce the burden on the controller. However, a single device AI is limited in functional diversity and deployment flexibility. Therefore, we propose the concept of cross-layer AI. Cross-layer AI is essentially a distributed AI based on the three-layer architecture of SDN. It consists of a central AI in the control plane and several device AI engines in the data plane. Driven by network monitoring data and assisted by AI algorithms, it performs network maintenance and optimization functions.
In this paper, to predict alarms, we apply an ML-based classifier to estimate whether a particular device alarm will occur, using features such as the input optical power and laser bias current. Compared to our previous work, the number of AI boards is increased to form a one-to-many cooperation mode. Data augmentation, model prediction, and other functions are realized through the deployment of device AI. The amount of data used in this paper is greatly reduced while favorable accuracy is maintained. Monitoring records from an SDH network are used to validate the method. Under the cross-layer AI architecture, system functions such as data preprocessing and data augmentation can be implemented by the relevant device AI engines. A three-stage approach is then proposed for alarm prediction based on cross-layer AI interaction.
- Benefiting from the architecture, alarm risk assessment and data augmentation using SMOTE are conveniently performed.
- Experimental results indicate that redundant data is effectively filtered out and the class-imbalance problem is solved.
- The prediction results of the random forest (RF) algorithm are analyzed in terms of precision and validity and compared with the outcomes of SVM [9] and Adaboost [10].
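The SMOTE-based data augmentation mentioned above can be sketched as follows. This is a simplified, standard-library illustration of the general SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours), not the implementation used in this paper. The function name `smote`, the neighbour count `k`, the feature layout, and all sample values are assumptions made for the example.

```python
import random
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(
    minority: Sequence[Vector], n_synthetic: int, k: int = 3, seed: int = 0
) -> List[Vector]:
    """Generate n_synthetic minority-class samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic: List[Vector] = []

    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority samples, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and its neighbour.
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])

    return synthetic


# Hypothetical alarm-class records: [input optical power (dBm), laser bias current (mA)].
alarm_samples = [[-18.2, 41.0], [-17.9, 42.5], [-18.6, 40.2], [-17.5, 43.1]]
normal_count = 12  # assumed size of the majority (no-alarm) class

new_samples = smote(alarm_samples, n_synthetic=normal_count - len(alarm_samples))
balanced_alarm = alarm_samples + new_samples
print(len(balanced_alarm))
```

Because each synthetic record is a convex combination of two real alarm records, it stays inside the range spanned by the observed alarm data, which is what distinguishes this approach from simply duplicating minority samples.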
The rest of the paper is organized as follows: Section II describes the architecture of SOON. Alarm classification prediction by cross-layer AI interaction is described in Section III. In Section IV, the experimental setup and results are shown. Finally, Section V concludes the paper.
SOON architecture
Our previous study [11] proposed an operation, administration, and maintenance (OAM)-oriented optical network architecture, i.e., self-optimized optical networks (SOON), to improve the intelligence and automation of network maintenance management. As shown in Fig. 1, the architecture of SOON consists of central AI in the control plane, device AI in the data plane, cognitive decision management (CDM) module, database, and SDN controller [12]. The functions are described as follows. 1) Central AI
Alarm classification prediction by Cross-layer AI
The alarms of optical network equipment are closely related to the equipment states. Equipment states can be quantified by the features collected by the network management system. The features are represented by the physical parameters of the equipment such as optical power, laser current, environmental temperature, power consumption, and other parameters. In general, these data will be recorded in the network management logs. The system periodically collects network performance and alarm
Experimental setup and results
To verify the prediction scheme based on cross-layer AI proposed in this paper, an experiment platform is set up. Two embedded devices, i.e., NVIDIA Jetson-Nano, are used to simulate the AI engine. The offline data is collected from a real network. Compared with the AI board (DP8020) used in Ref. [8], Jetson-Nano is compatible with multiple AI frameworks, which makes it easier to deploy. Besides, Jetson-Nano could provide 472 GFLOPS (billion floating-point operations per second) and consume
Conclusion
To reduce the burden on the controller during alarm prediction, this paper proposes a novel alarm prediction method based on cross-layer AI, which extends SOON system functions by deploying device AI in the data plane. Cross-layer AI can perform task decomposition and multi-party cooperation to improve processing and system management efficiency. Equipment data collected from the network management system in a commercial SDH network is used to prove the validity of the method. A testbed has
CRediT authorship contribution statement
Bing Zhang: Conceptualization, Software, Writing - original draft. Yongli Zhao: Writing - review & editing. Sabidur Rahman: Conceptualization, Methodology. Yajie Li: Writing - review & editing. Jie Zhang: Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work has been supported in part by the National Natural Science Foundation of China (NSFC) (61822105), the Fundamental Research Funds for the Central Universities (2019XD-A05), and the State Key Laboratory of Information Photonics and Optical Communications of China (IPOC2019ZR01).
References
- An overview on application of machine learning techniques in optical networks. IEEE Commun. Surv. Tutorials (2018)
- Dealing with alarms in optical networks using an intelligent system. IEEE Access (2019)
- Jaudet, Mohammad, Nacem Iqbal, and Amir Hussain. “Neural networks for fault-prediction in a telecommunications...
- Cognitive assurance architecture for optical network fault management. J. Lightwave Technol. (2018)
- Failure prediction using machine learning and time series in optical network. Opt. Express (2017)
- Zhong, Jiang, Weili Guo, and Zhenhua Wang. “Study on network failure prediction based on alarm logs.” 2016 3rd MEC...
- Liu, Gengchen, et al. “The first testbed demonstration of cognitive end-to-end optical service provisioning with...
- Coordination between control layer AI and on-board AI in optical transport networks. IEEE/OSA J. Opt. Commun. Networking (2019)