Abstract
The rapid development and deployment of network services has brought a series of challenges to researchers. On the one hand, the needs of Internet end users/applications are increasingly differentiated, and they pursue different aspects of service quality. On the other hand, with the explosive growth of information in the era of big data, a large amount of private information is stored in the network, so end users/applications naturally pay attention to network security. To meet the requirements of differentiated quality of service (QoS) and security, this paper proposes a virtual network embedding (VNE) algorithm based on deep reinforcement learning (DRL) that considers the CPU, bandwidth, delay and security attributes of the substrate network. The DRL agent is trained in a network environment constructed from these attributes. Its purpose is to derive the mapping probability of each substrate node, and virtual nodes are mapped according to this probability. Finally, a breadth-first search (BFS) strategy is used to map the virtual links. In the experimental stage, the DRL-based algorithm is compared with other representative algorithms in three aspects: long-term average revenue, long-term revenue consumption ratio and acceptance rate. The results show that the proposed algorithm performs well on these metrics, which indicates that it can be effectively applied to meet the differentiated QoS and security requirements of end users/applications.
1 Introduction
The rapid development of network technology and the explosive growth of end users/applications have brought greater opportunities and challenges to infrastructure providers (InPs) and service providers (SPs) [1, 2]. In the 5G era, many network applications are based on a virtual network architecture. In this architecture, the implementation of network functions no longer depends on specific hardware facilities but on software programming, which allows virtual functions to be deployed flexibly [3,4,5]. For example, intelligent applications such as the Internet of vehicles (IoV), intelligent medical care and military unmanned aerial vehicles (UAVs) rely on a strong virtual network infrastructure [6,7,8]. However, using network virtualization (NV) technology to provide services for these intelligent applications raises a series of challenges [9, 10].
On the one hand, QoS is closely tied to the needs of network end users/applications. With the growth of end users/applications and the expansion of network services, their QoS requirements have become differentiated [11]. For users of the IoV or UAVs, the primary demand is real-time and accurate control of the vehicles or UAVs [12]. Driverless cars need to judge road conditions and make accurate decisions in time to avoid traffic accidents [13, 14], and UAVs need timely commands to strike targets accurately, so both need InPs to provide low-latency network services. Live video streaming users, in contrast, require the network to provide a large amount of bandwidth in a short period of time to keep the video smooth, which places a high-bandwidth service demand on the InPs. Other QoS requirements include low cost and low memory consumption. Therefore, large-scale end users/applications put forward differentiated QoS requirements for InPs [15,16,17,18].
On the other hand, as network technology becomes ever more intertwined with daily life, people increasingly use the network to store personal information [19,20,21,22,23]. Important private information, such as bank account, health monitoring and electronic payment data, must not be exposed or stolen, yet it is precisely the target of malicious devices and malware attacks [24, 25]. Therefore, the security of network services should be considered when satisfying the differentiated QoS requirements of network end users/applications [26,27,28,29].
In the NV environment, the requirements of network end users/applications are presented in the form of virtual network requests (VNRs). The SP is responsible for sending the VNR of the network end user/application, and the InP is expected to allocate sufficient underlying network resources to meet the demand. This is the VNE problem [30,31,32]. This paper focuses on the VNE problem with differentiated QoS requirements and security. According to the functional requirements of different users, a differentiated-QoS VNE algorithm is designed around three aspects: VNE cost, network bandwidth and delay. To address the security problems exposed in the process of VNE, a security level is set for the VNRs and for the substrate network. The resulting VNE algorithm meets the requirements of both differentiated QoS and security.
To improve the decision-making and optimization ability of the VNE algorithm, deep learning and reinforcement learning methods are applied. Deep learning is mainly used to solve decision-making problems in high-dimensional spaces. A network model is established by imitating biological neural networks, the problem to be decided is fed into the neural network, and a good solution to the problem is obtained as the output [33, 34]. Reinforcement learning emphasizes the learning and training of agents through interaction with the environment, and optimizes decisions by using an evaluative feedback signal. In reinforcement learning, training data is usually used to train the agent, and the agent adjusts its actions according to the size of the feedback signal, so that its decisions improve continuously. The trained agent is then evaluated on the test data set [35,36,37]. The VNE problem has been proved to be NP-hard [38], so deep reinforcement learning can be used to search for high-quality solutions to the VNE problem that meet the differentiated QoS and security needs of network end users/applications.
The main work of this paper is as follows:
(1) Aiming at the differentiated QoS and security requirements of network end users/applications, this paper applies the representative DRL algorithm to the VNE problem. Through the efficient training results of agents, the differentiated QoS and security requirements of network end users/applications can be effectively solved.
(2) In the DRL algorithm, we extract four important attributes for the substrate network: CPU, bandwidth, delay and security. Using the custom policy network as the agent, the feature matrix is used as the input of the policy network for training. In this way, agents can be trained in a more realistic network environment, and the experimental results are also optimal. Finally, the mapping probability of each substrate node can be obtained.
(3) In order to prove the effectiveness of the algorithm, the VNE algorithm based on DRL is compared with other representative algorithms in three aspects of long term average revenue, long term revenue consumption ratio and acceptance rate. The experimental results show that the algorithm achieves good results and proves the effectiveness of the algorithm.
The rest of this paper is organized as follows: The second part describes the research status of VNE algorithms for differentiated QoS and security. The third part describes the problem of VNE and establishes the network model. The fourth part describes the implementation process of VNE algorithm based on differentiated QoS and security requirements. The fifth part introduces the setting of simulation experiment, then shows the experimental results and analyzes them. The last part summarizes the whole paper.
2 Related work
2.1 VNE algorithms based on differentiated QoS
In reference [39], a dynamic heuristic algorithm is proposed, which focuses on receiving as many VNRs as possible instead of optimizing the QoS performance of each VNR. When the QoS of VNR is not satisfied, the algorithm will drive the re-embedding scheme of heuristic algorithm to meet the given QoS requirements. Reference [40] considers that the existing solutions only aim at the congestion control problem of single objective VNE, and proposes a multi-objective VNE solution. Aiming at energy saving, energy sensing, avoiding network congestion and other service indicators, the embedding process of virtual network is completed by combining the heuristic solution method based on SDN. In reference [41], a dynamic network resource allocation method based on load balancing and QoS is proposed. In this paper, the author proposes a QoS based scheduling mechanism for VNRs, which can reasonably rank incoming services by calculating the priority of VNRs. At the same time, this method uses a resource allocation mechanism based on load balancing to avoid the imbalance of resource consumption. In reference [42], an intelligent delay aware VNE scheme iVNE is proposed. This scheme focuses on the problem that the existing embedding algorithm of virtual network is not necessarily the optimal algorithm of industrial wireless network, but also lacks the QoS capacity. It provides delay guarantee for various industrial virtual networks, including static embedding process and dynamic forwarding process. Finally, the VNE algorithm achieves good load balancing ability. Reference [43] pays attention to the resource constraints and service quality issues of the IoV scenario. Based on artificial intelligence and machine learning, the author pushes cache and communication resources to the edge of smart cars, and jointly realizes the offloading of roadside units (RSU). The author uses a mixed integer nonlinear programming (MINLP) model to reduce the total network delay. 
The final experimental results prove that this method is effective in reducing user communication, computing, network congestion and content download delays.
2.2 VNE algorithms based on security
In reference [44], network function virtualization (NFV) technology is applied to the field of network security. The network services processed by virtual network security function chain may be sensitive to specific network requirements, such as maximum bandwidth or minimum delay. The author proposes a gradual security service embedding scheme, which can optimize the resource utilization and deploy the virtual security function chain efficiently according to the security requirements of a single application and the strategy of the operator. In reference [45], a heuristic security aware VNE algorithm (SA-VNE) is proposed. The algorithm uses TOPSIS to sort the importance of the base nodes and select the most suitable base nodes. Finally, the shortest path algorithm is used to complete the link mapping process. In reference [46], trust relationship and trust degree are introduced into the problem of VNE, and the security problems in NV environment are analyzed quantitatively. This paper proposes a trust aware security VNE algorithm, which considers the local and global importance of nodes in the mapping process, and uses the approximate ideal ranking method to sort the substrate nodes. Finally, the k-shortest path method is used to complete the link mapping. In reference [47], the mapping process of security virtual network is modeled as a multi-objective mixed integer linear programming model, and a mapping algorithm of security virtual network based on multi-attribute comprehensive evaluation of nodes and path optimization is proposed. In this algorithm, the resource richness, security attributes and topological proximity of nodes are regarded as the criteria of node selection. In the link mapping phase, the available bandwidth and the number of path hops are used as the evaluation objects to select the mapping link. Finally the whole VNE process is completed.
The VNE algorithms for differentiated QoS requirements and the security-aware VNE algorithms reviewed above have made solid progress, but they still have the following problems. First, the studies on VNE for QoS requirements do not state clearly which specific QoS indicators they address. They treat QoS as a single evaluation standard, which does not reflect differentiation, and they give no concrete examples of the differentiated QoS requirements to be met, which limits their general practical significance. Second, the research on security-aware VNE relies only on traditional heuristic methods. With the rapid development of intelligent learning methods, it is of great significance to apply them to the VNE problem: a learning-based method can allocate network resources that satisfy the security requirements of VNRs and has clear advantages over traditional heuristics. Finally, no existing work on VNE combines differentiated QoS requirements with security. In this paper, we combine the two and apply a DRL method to the VNE problem.
3 Description and model establishment of VNE problem with differentiated QoS and security
3.1 Description of VNE problem with differentiated QoS and security
The different network functional requirements of network end users/applications take the form of multiple heterogeneous VNRs. The process of reasonably allocating underlying network resources to these VNRs is called VNE. The embedding of a virtual network can be divided into two parts: node embedding and link embedding [48, 49].
In the node embedding stage, virtual nodes need to find substrate nodes to meet their resource requirements. In order to better solve the problem of VNE with differentiated QoS requirements, we will focus on the cost of CPU resource consumption and the impact of node delay on QoS. For the security problem, the security requirement level attribute will be set for each virtual node, and each virtual node can only be embedded in the substrate node that meets its security requirements. Therefore, CPU resource, delay and security are three important indexes in the node attribute setting.
In the stage of link embedding, virtual link needs to find a substrate link to meet its resource requirements. A virtual link can be embedded in one substrate link or multiple substrate links through path segmentation. In order to better solve the problem of VNE with differentiated QoS requirements, we will focus on the cost of bandwidth consumption and the impact of link delay on QoS. Therefore, bandwidth resource and delay are two important indexes in link attribute setting.
In the whole process of VNE, CPU, bandwidth, delay and security are taken as the starting point to solve the problem of VNE with differentiated QoS requirements and security. By setting reasonable attributes for nodes and links, a reliable solution is provided for the problem of VNE facing differentiated QoS requirements and security.
3.2 Network models
The undirected weighted graph \(G^V=\{N^V,L^V,A_N^V,A_L^V\}\) is used to model the virtual network. \(G^V\) represents a separate VNR. \(N^V\) represents the collection of virtual nodes in the VNR, and \(n^v\) represents a certain virtual node. \(L^V\) represents the virtual link set in the VNR, and \(l^v\) represents a certain virtual link. \(A_N^V\) represents the attribute set of virtual nodes. For a specific virtual node \(n^v\), its attributes include CPU resource requirement \(CPU(n^v)\), delay level requirement \(DELAY(n^v)\) and security level requirement \(SR(n^v)\). \(A_L^V\) represents the attribute set of virtual links. For a specific virtual link \(l^v\), its attributes include bandwidth resource requirement \(BW(l^v)\) and delay level requirement \(DELAY(l^v)\).
The undirected weighted graph \(G^S=\{N^S,L^S,A_N^S,A_L^S\}\) is used to build the mathematical model for the substrate network. \(G^S\) represents the entire substrate network. \(N^S\) represents the node set of the substrate network, and \(n^s\) represents a certain substrate node. \(L^S\) represents the set of links in the substrate network, and \(l^s\) represents a certain substrate link. \(A_N^S\) represents the attribute set of substrate nodes. For a specific substrate node \(n^s\), its attributes include available CPU resource \(CPU(n^s)\), delay level \(DELAY(n^s)\) and security level \(SL(n^s)\). \(A_L^S\) represents the attribute set of the substrate link. For a specific physical link \(l^s\), its attributes include the available bandwidth resource \(BW(l^s)\) and delay level \(DELAY(l^s)\).
Figure 1 shows the topology of a virtual network and a substrate network. The virtual network consists of two virtual nodes and one virtual link. The three numbers next to the virtual node represent the CPU resource demand, delay demand level and security demand level of the virtual node. The two numbers on the virtual link represent the bandwidth resource demand and delay demand level of the virtual link. For a substrate network, the three numbers next to each substrate node represent the current available CPU resources, delay level and security level of the substrate node. Two numbers on each substrate link represent the amount of bandwidth resources and delay level that the substrate link can provide.
3.3 Constraints
The embedding of VNRs is limited by the amount of underlying network resources, so virtual networks cannot be embedded on the underlying network without limit. Every successfully embedded VNR consumes a certain amount of network resources, mainly the CPU resources of nodes and the bandwidth resources of links. In addition, because differentiated QoS and security must be considered jointly, we stipulate that virtual nodes and virtual links can only be mapped to substrate nodes and substrate links whose delay level is not greater than their delay demand level. At the same time, a virtual node must be mapped to a substrate node whose security level is not less than its security requirement level. With these constraints, the VNE meets the differentiated QoS requirements on CPU, bandwidth and delay as well as the security requirements. The constraints are formulated below.
The current available CPU resources of substrate node \(n^s\) can be represented by the remaining CPU resources:
\[CPU(n^s)=CPU_{initial}(n^s)-\sum _{all(n^v \uparrow n^s)}CPU(n^v) \tag{1}\]
\(CPU(n^s)\) represents the current remaining CPU resources of substrate node \(n^s\). \(CPU_{initial}(n^s)\) represents the initial total CPU resources of substrate node \(n^s\). \(\sum _{all(n^v \uparrow n^s )}CPU(n^v)\) represents the CPU resources consumed by all virtual nodes mapped to substrate node \(n^s\). The symbol \(n^v \uparrow n^s\) indicates that virtual node \(n^v\) is mapped to substrate node \(n^s\).
The currently available bandwidth resources of substrate link \(l^s\) can be represented by the remaining bandwidth resources:
\[BW(l^s)=BW_{initial}(l^s)-\sum _{all(l^v \uparrow l^s)}BW(l^v) \tag{2}\]
\(BW(l^s)\) represents the current remaining bandwidth resources of substrate link \(l^s\). \(BW_{initial}(l^s)\) represents the initial total bandwidth resources of substrate link \(l^s\). \(\sum _{all(l^v \uparrow l^s)}BW(l^v)\) represents the bandwidth resources consumed by all virtual links mapped to substrate link \(l^s\). The symbol \(l^v \uparrow l^s\) indicates that virtual link \(l^v\) is mapped to substrate link \(l^s\).
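Formulas (3) and (4) represent the CPU resource constraint and the bandwidth resource constraint of VNE, respectively. A virtual node or virtual link can only be mapped to a substrate node or substrate link whose remaining resources cover its demand:
\[CPU(n^v)\le CPU(n^s), \quad \text{if } n^v \uparrow n^s \tag{3}\]
\[BW(l^v)\le BW(l^s), \quad \text{if } l^v \uparrow l^s \tag{4}\]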
Formulas (5) and (6) represent the node delay constraint and the link delay constraint of VNE, respectively. Virtual node \(n^v\) can only be mapped to a substrate node \(n^s\) whose delay level is not greater than its delay requirement level, and virtual link \(l^v\) can only be mapped to a substrate link \(l^s\) whose delay level is not greater than its delay requirement level:
\[DELAY(n^s)\le DELAY(n^v), \quad \text{if } n^v \uparrow n^s \tag{5}\]
\[DELAY(l^s)\le DELAY(l^v), \quad \text{if } l^v \uparrow l^s \tag{6}\]
Formula (7) represents the security constraint of VNE. We set a security requirement level for each virtual node and a security level for each substrate node. Virtual node \(n^v\) can only be mapped to a substrate node \(n^s\) whose security level is not less than its security requirement level:
\[SL(n^s)\ge SR(n^v), \quad \text{if } n^v \uparrow n^s \tag{7}\]
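The constraints of formulas (1) to (7) can be read as simple feasibility checks. The following Python sketch is only an illustration under assumptions: the class and function names (`SubstrateNode`, `VirtualNode`, `node_feasible`, `link_feasible`, `embed_node`) are hypothetical and do not come from the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class SubstrateNode:
    cpu: float     # remaining CPU resources, CPU(n^s)
    delay: int     # delay level, DELAY(n^s)
    security: int  # security level, SL(n^s)


@dataclass
class VirtualNode:
    cpu: float     # CPU requirement, CPU(n^v)
    delay: int     # delay requirement level, DELAY(n^v)
    security: int  # security requirement level, SR(n^v)


def node_feasible(v: VirtualNode, s: SubstrateNode) -> bool:
    """CPU, node delay and security constraints for mapping v onto s."""
    return s.cpu >= v.cpu and s.delay <= v.delay and s.security >= v.security


def link_feasible(v_bw: float, v_delay: int, s_bw: float, s_delay: int) -> bool:
    """Bandwidth and link delay constraints for one substrate link."""
    return s_bw >= v_bw and s_delay <= v_delay


def embed_node(v: VirtualNode, s: SubstrateNode) -> bool:
    """Map v onto s if feasible and deduct the consumed CPU from s."""
    if not node_feasible(v, s):
        return False
    s.cpu -= v.cpu
    return True
```

Deducting the demand inside `embed_node` keeps the remaining CPU of the substrate node consistent with the remaining-resource definition after every successful mapping.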
3.4 Evaluating indicators
We take the long-term average revenue, acceptance rate and long-term revenue consumption ratio of VNE as the indexes to evaluate the performance of the algorithm, because these three indicators reflect, to a certain extent, how well the differentiated QoS requirements (CPU resource consumption, bandwidth resource consumption and delay constraint) and the security constraint are met.
The revenue of VNE is as follows:
\[R(G^V,t)=\sum _{n^v \in N^V}CPU(n^v)+\sum _{l^v \in L^V}BW(l^v)\]
where \(R(G^V,t)\) represents the revenue of embedding the virtual network within the time period t of the arrival of the VNR. \(\sum _{n^v \in N^V}CPU(n^v)\) represents the CPU resource revenue obtained from the CPU resources consumed by the virtual nodes \(n^v\). \(\sum _{l^v \in L^V}BW(l^v)\) represents the bandwidth resource revenue corresponding to the bandwidth resources consumed by the virtual links \(l^v\). The revenue of VNE is determined by the sum of the CPU resources and bandwidth resources requested by the VNR.
The long-term average revenue of embedded virtual network is as follows:
The consumption of VNE is expressed as follows:
\[C(G^V,t)=\sum _{n^v \in N^V}CPU(n^v)+\sum _{l^v \in L^V}BW(l^v)\times hops(l^v)\]
where \(C(G^V,t)\) represents the consumption of VNE within the time period t of VNR arrival. \(\sum _{n^v \in N^V}CPU(n^v)\) represents the CPU resources consumed by the virtual nodes \(n^v\). \(\sum _{l^v \in L^V}BW(l^v)\times hops(l^v)\) represents the bandwidth resources consumed by the virtual links \(l^v\). Since a virtual link may be mapped onto a path of multiple substrate links, \(hops(l^v)\) represents the number of hops of the virtual link \(l^v\). The consumption of VNE is determined by the sum of the CPU resources and bandwidth resources actually consumed in the substrate network by the VNR.
The ratio of long-term revenue consumption embedded in virtual network is expressed as follows:
The acceptance rate of VNE is expressed as follows:
where \(\sum _{t=0}^{T}num(VNR_{acc})\) represents the number of VNRs successfully embedded in the time range t. \(\sum _{t=0}^{T}num(VNR_{arr})\) represents the total number of VNRs that arrive in the time range t.
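As a rough illustration of how the three long-term indicators could be aggregated over a finite simulation, the sketch below assumes that the per-request revenue and consumption have already been computed and that the limits over time are approximated by a finite horizon `horizon`. The function name `long_term_metrics` and its argument layout are hypothetical.

```python
def long_term_metrics(accepted, arrived, horizon):
    """Aggregate evaluation indicators over a finite simulation.

    accepted: list of (revenue, consumption) pairs, one per accepted VNR
    arrived:  total number of VNRs that arrived
    horizon:  elapsed simulation time T (assumed finite here)
    """
    total_revenue = sum(r for r, _ in accepted)
    total_consumption = sum(c for _, c in accepted)
    return {
        # accumulated revenue divided by elapsed time
        "avg_revenue": total_revenue / horizon,
        # accumulated revenue divided by accumulated consumption
        "rc_ratio": total_revenue / total_consumption if total_consumption else 0.0,
        # accepted VNRs divided by arrived VNRs
        "acceptance_rate": len(accepted) / arrived if arrived else 0.0,
    }
```

For example, `long_term_metrics([(25, 25), (25, 32)], arrived=4, horizon=100)` describes a run in which two of four requests were accepted.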
3.5 An example
Figure 2 shows the different situations of two kinds of VNE. In case 1, the node mapping relationship is \(a \uparrow A\), \(b \uparrow B\), and the virtual link does not have path segmentation at this time. In case 2, the node mapping relationship is \(a \uparrow F\), \(b \uparrow D\), and the virtual link has path splitting. In both cases, the successful embedding of VNRs consumes the corresponding CPU resources and bandwidth resources. The delay constraints and security constraints are also satisfied. But in both cases, the ratio of revenue to consumption is different.
In case 1, the revenue of VNE is:
The consumption of VNE is as follows:
So the ratio of revenue to consumption in case 1 is 1.
In case 2, the revenue of VNE is:
The consumption of VNE is as follows:
So the ratio of revenue to consumption in case 2 is \(\frac{25}{32}\).
It can be seen that path segmentation results in a greater cost of resource consumption. Therefore, the VNE algorithm should be designed to avoid path segmentation as far as possible.
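The effect of path segmentation on the revenue to consumption ratio can be checked numerically. The demand values below are hypothetical and are not taken from Fig. 2: they are chosen only so that the two cases give the ratios 1 and \(\frac{25}{32}\) stated above (two virtual nodes with CPU demands 10 and 8, one virtual link with bandwidth demand 7, mapped over one hop in case 1 and two hops in case 2).

```python
from fractions import Fraction


def revenue(cpu_demands, bw_demands):
    """Sum of requested CPU and bandwidth resources."""
    return sum(cpu_demands) + sum(bw_demands)


def consumption(cpu_demands, bw_demands, hops):
    """CPU plus bandwidth weighted by the hop count of each virtual link."""
    return sum(cpu_demands) + sum(bw * h for bw, h in zip(bw_demands, hops))


cpu = [10, 8]  # hypothetical CPU demands of virtual nodes a and b
bw = [7]       # hypothetical bandwidth demand of the virtual link

# Case 1: the virtual link is mapped onto a single substrate link.
ratio_case1 = Fraction(revenue(cpu, bw), consumption(cpu, bw, hops=[1]))
# Case 2: the virtual link is mapped onto a two-hop substrate path.
ratio_case2 = Fraction(revenue(cpu, bw), consumption(cpu, bw, hops=[2]))
```

The revenue is identical in both cases. Only the bandwidth term of the consumption grows with the hop count, which is why the ratio falls below 1 in case 2.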
4 Implementation of VNE algorithm based on differentiated QoS and security requirements
4.1 The framework of VNE algorithm based on DRL
With the continuous improvement of intelligent learning algorithms and the expansion of their application range, applying DRL methods is becoming a mainstream way to solve the VNE problem. The key questions in using DRL for VNE are which kind of neural network is used to train the agent and how to create a realistic network environment for the agent. The framework of the DRL-based VNE algorithm is shown in Fig. 3.
In order to train the DRL agent in a more real network environment, we extract four important attributes of the substrate network according to the differentiated QoS requirements and security requirements, so as to build the agent training environment. Feature extraction is described in detail below. DRL combines the advantages of deep learning and reinforcement learning, and agents can be well trained in our own policy network. Finally, a good result of VNE can be obtained.
4.2 Network feature extraction
The purpose of feature extraction of substrate network is to create a more real network environment for DRL agent. The DRL agent can make the best decision only when it is trained in the substrate network environment. Under the NV architecture, the substrate network is very complex and the network features available for extraction are very rich. Considering the computing power of our own policy network, if we extract too many network features, it will cause great computational complexity and reduce the performance of the algorithm. For differentiated QoS requirements and security requirements, we extract the following four attributes for each physical node:
(1) CPU resources. CPU resource is one of the most important resources in the network environment, which is an important factor affecting the cost of VNE. The CPU resource calculation method of the substrate node is shown in formula (1).
(2) The sum of bandwidth. This is the total bandwidth of all links connected to a substrate node, which reflects how well the node can serve the bandwidth resource requirements of users. The bandwidth sum of the links connected to a substrate node is expressed as:
(3) Delay. We set the delay attribute for both the substrate node and the substrate link. Virtual nodes and links can only be mapped to substrate nodes and links that are no higher than their delay requirements [50]. The specific expression is shown in formula (5).
(4) Security level. We set the security level for each substrate node. Virtual nodes can only be mapped to substrate nodes whose security level is no lower than their security requirement level [51]. The specific expression is shown in formula (7).
CPU resources and bandwidth resources are the main resources consumed by VNRs. The evaluation criteria for evaluating the performance of VNE algorithm are also designed based on these two attributes. In order to meet the differentiated QoS requirements of network end-users/applications, it is necessary to consider the bandwidth resources and delay together. They deal with different QoS scenarios respectively. In addition, in view of the security problems in the network environment, innovatively considering the security attribute in the VNE problem can provide a new idea for the security of the VNE algorithm.
The extracted node attributes are concatenated into a four-dimensional feature vector. The feature vector of substrate node \(n^s\) is expressed as:
By stacking the feature vectors of all substrate nodes, a feature matrix with one four-dimensional row per substrate node is obtained. The feature matrix serves as the input of the policy network, and the DRL agent is trained in the environment it describes. The feature matrix is expressed as:
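A minimal sketch of the feature extraction step is given below. It assumes a dictionary-based representation of the substrate network and the column order (CPU, bandwidth sum, delay level, security level). The names `bandwidth_sum` and `feature_matrix` are hypothetical, and any normalization the authors may apply is not shown.

```python
def bandwidth_sum(node, links):
    """Total remaining bandwidth of all substrate links incident to `node`.

    links: dict mapping an undirected link (u, v) to its remaining bandwidth.
    """
    return sum(bw for (u, v), bw in links.items() if node in (u, v))


def feature_matrix(nodes, links):
    """Build one four-dimensional feature row per substrate node.

    nodes: dict mapping node id to (remaining CPU, delay level, security level).
    Rows are ordered by node id; columns are
    [CPU, bandwidth sum, delay level, security level].
    """
    return [
        [cpu, bandwidth_sum(n, links), delay, security]
        for n, (cpu, delay, security) in sorted(nodes.items())
    ]
```

Because the remaining CPU and bandwidth change after every successful embedding, the matrix would be rebuilt from the current substrate state each time a VNR arrives.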
4.3 Policy network construction
We use the basic elements of artificial neural network to build a simple policy network as a DRL agent. The extracted feature matrix is used as the input of the policy network, and the agent learns and trains in this environment, in order to deduce the probability of each substrate node being mapped. The resulting policy network is shown in Fig. 4. It mainly includes input layer, convolution layer, softmax layer and output layer.
The input layer receives the feature matrix extracted from the substrate network, and the policy network then evaluates the node attributes in the feature matrix. In the convolution layer, a convolution operation is performed on each feature vector to obtain its available resource vector \(r_i\). The operation is as follows:
\[r_i=\omega \cdot v_i+o\]
where \(r_i\) is the available resource vector of the ith feature vector. \(\omega \) is the convolution kernel weight vector of the convolution layer. \(v_i\) is the ith feature vector and o is the offset.
In the softmax layer, a probability is generated for each node according to its available resource vector, and the substrate nodes are ordered according to this probability. The softmax function is a generalization of logistic regression. It converts an n-dimensional vector into n real values between 0 and 1 that sum to 1 [52]. The calculation method is as follows:
\[p_i=\frac{e^{r_i}}{\sum _{j=1}^{k}e^{r_j}}\]
where \(p_i\) represents the mapping probability of the ith substrate node. k is the total number of feature vectors.
Finally, a set of available substrate nodes and their mapping probabilities are output at the output layer.
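The forward pass through the convolution and softmax layers can be sketched in a few lines of plain Python. This is an illustration under assumptions rather than the authors' implementation: the convolution is taken to be a single kernel applied to each four-dimensional feature vector (a dot product plus an offset), and the function name `policy_forward` is hypothetical.

```python
import math


def policy_forward(features, weights, offset):
    """Return one mapping probability per substrate node.

    features: list of feature vectors v_i (one per substrate node)
    weights:  convolution kernel weight vector (same length as each v_i)
    offset:   scalar offset o
    """
    # Convolution layer: r_i = w . v_i + o for every feature vector.
    scores = [sum(w * x for w, x in zip(weights, row)) + offset for row in features]
    # Softmax layer: subtract the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Filtering out substrate nodes that violate the CPU, delay or security constraints before selecting a node is not shown here.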
4.4 Training and testing
In the training phase, whenever a VNR arrives, the policy network will extract a feature matrix from the substrate network as the input, so the feature matrix is dynamic. According to the input feature matrix, the agent can learn the most real situation of the substrate network and make the optimal decision. In reinforcement learning, in order to encourage the agent to make the best decision, a reward signal is usually set for the agent. The agent will decide which action to take according to the size of the reward signal. The size of the reward signal also depends on whether the agent’s action is beneficial to it. The two are interactive. If an action taken by an agent receives a larger reward, the agent will take similar actions to accumulate the reward. In the problem of VNE, the revenue consumption ratio of VNE is usually used as the reward signal of agent. On the one hand, as an important evaluation index of VNE, the index has certain representativeness. On the other hand, the index reflects the utilization rate of the underlying network resources to a certain extent. There is a positive correlation between revenue consumption ratio and reward signal.
According to the method of supervised learning, a manual label is introduced for each feature vector in the policy network. Suppose that the manual label is introduced for the ith feature vector, then the label is 0 except that the ith position is 1. That is:
The cross entropy loss is calculated as follows:
\[L(label,c)=-\sum _{i}label_i\log (c_i)\]
where \(label_i\) and \(c_i\) are the ith element of label and the output of policy network respectively.
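Back propagation is used to calculate the gradient of the parameters in the policy network, and the final gradient is scaled by the reward:
\[g=\alpha \cdot r\cdot g_s\]
where \(\alpha \) is the learning rate, which controls the size of each training step. r is the size of the reward signal. \(g_s\) is the stacking gradient.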
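The interaction between the one-hot label, the cross entropy loss and the reward-scaled gradient can be illustrated with a single training step. The sketch below is an assumption-laden simplification: it uses the linear scoring layer plus softmax described above, applies the update immediately instead of stacking gradients over a batch, and all names (`softmax`, `train_step`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(features, weights, offset, chosen, reward, alpha=0.005):
    """One reward-scaled gradient step for the selected substrate node.

    chosen: index of the selected node, used as the one-hot label
    reward: reward signal r (for example the revenue consumption ratio)
    alpha:  learning rate
    Returns (loss, new_weights, new_offset).
    """
    scores = [sum(w * x for w, x in zip(weights, row)) + offset for row in features]
    probs = softmax(scores)
    # Cross entropy with a one-hot label reduces to -log(p_chosen).
    loss = -math.log(probs[chosen])
    # Gradient of the loss w.r.t. each score: p_j - label_j.
    dscore = [p - (1.0 if j == chosen else 0.0) for j, p in enumerate(probs)]
    # Back propagate to the shared kernel weights and the offset.
    grad_w = [sum(d * row[k] for d, row in zip(dscore, features))
              for k in range(len(weights))]
    grad_o = sum(dscore)
    # Scale the gradient by learning rate and reward, then descend.
    new_weights = [w - alpha * reward * g for w, g in zip(weights, grad_w)]
    new_offset = offset - alpha * reward * grad_o
    return loss, new_weights, new_offset
```

With a positive reward the step raises the probability of the chosen node, and a larger reward produces a proportionally larger step, which matches the positive correlation between the revenue consumption ratio and the reward signal.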
The training process of DRL agent is shown in algorithm 1.
In the test phase, according to the node mapping probability obtained in the training phase, the virtual nodes are mapped in turn. Finally, the BFS is used to map the virtual links. The test process is shown in algorithm 2.
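The link mapping step can be sketched as a breadth-first search over the substrate network that skips links violating the bandwidth or delay constraints. The adjacency format and the function name `bfs_path` are assumptions made for illustration. Because BFS explores paths in order of hop count, the first path found has the fewest hops among the feasible ones, which also keeps the bandwidth consumption low.

```python
from collections import deque


def bfs_path(adj, src, dst, bw_demand, delay_level):
    """Find a fewest-hop feasible substrate path from src to dst.

    adj: dict mapping node -> {neighbor: (remaining bandwidth, delay level)}
    Returns the path as a list of nodes, or None if no feasible path exists.
    """
    if src == dst:
        return [src]
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v, (bw, delay) in adj[u].items():
            # Skip visited nodes and links that break a constraint.
            if v in parent or bw < bw_demand or delay > delay_level:
                continue
            parent[v] = u
            if v == dst:
                # Walk back through the parents to rebuild the path.
                path = [v]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(v)
    return None
```

After a path is found, the bandwidth demand would be deducted from every substrate link on it. That bookkeeping is omitted here.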
5 Experimental setup and result analysis
In this part, we first introduce the setting of the experimental environment, then we will show the experimental results and analyze them.
5.1 Experimental setup
We build a medium scale substrate network with 100 substrate nodes and 570 substrate links. The initial CPU resources of each substrate node are evenly distributed between 50 units and 100 units. In order to reflect the different requirements of users for delay characteristics, we set the delay level for each substrate node. The delay level is evenly distributed between 1 and 3. In order to reflect the user’s demand for network security, we set the security level for each physical node. The security level is evenly distributed between 1 and 3. The initial bandwidth resources of each substrate link are evenly distributed between 50 units and 100 units, and the delay level is evenly distributed between 1 and 3.
We generated 2000 VNRs, of which the first 1000 were used as the training set and the last 1000 as the test set. Each VNR contains 2 to 10 virtual nodes, and each pair of virtual nodes is connected with a probability of 50%. The CPU resource demand of each virtual node is uniformly distributed between 1 and 50 units. To reflect the different delay requirements of users, we set a delay requirement level for each virtual node, uniformly distributed between 1 and 3. To reflect the users' demand for network security, we set a security requirement level for each virtual node, also uniformly distributed between 1 and 3. The bandwidth resource demand of each virtual link is uniformly distributed between 1 and 50 units, and its delay requirement level is uniformly distributed between 1 and 3.
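The request set described above can be sketched as follows. The sketch assumes that every pair of virtual nodes is linked independently with probability 0.5 and does not enforce that a request is connected; the attribute names, the real-valued demands and the seed are illustrative choices rather than details from the paper.

```python
import random
from itertools import combinations

def generate_vnr(rng):
    """Generate one virtual network request as (nodes, links)."""
    n = rng.randint(2, 10)
    nodes = {
        v: {
            "cpu": rng.uniform(1, 50),      # CPU demand in units
            "delay": rng.randint(1, 3),     # delay requirement level
            "security": rng.randint(1, 3),  # security requirement level
        }
        for v in range(n)
    }
    links = {
        (u, v): {"bw": rng.uniform(1, 50), "delay": rng.randint(1, 3)}
        for u, v in combinations(range(n), 2)
        if rng.random() < 0.5               # 50% connection probability
    }
    return nodes, links

rng = random.Random(0)
vnrs = [generate_vnr(rng) for _ in range(2000)]
train_set, test_set = vnrs[:1000], vnrs[1000:]
```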
The arrival of VNRs is modeled as a Poisson process in which, on average, 4 VNRs arrive every 100 time units, and the duration of each request follows an exponential distribution. We trained the agent for 100 epochs using gradient descent, with the learning rate set to 0.005.
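A Poisson arrival process with an average of 4 requests per 100 time units has exponentially distributed inter-arrival times with a mean of 25 time units, which makes it easy to simulate. In the sketch below, the mean request duration of 1000 time units is a placeholder assumption, because this section does not state it; the function name and the seed are also illustrative.

```python
import random

def simulate_arrivals(n_requests, rate=4 / 100, mean_duration=1000.0, seed=0):
    """Return a list of (arrival_time, departure_time) pairs.

    rate is the expected number of arrivals per time unit.
    mean_duration is an assumed value, not taken from the paper.
    """
    rng = random.Random(seed)
    t = 0.0
    events = []
    for _ in range(n_requests):
        t += rng.expovariate(rate)                       # inter-arrival time
        duration = rng.expovariate(1.0 / mean_duration)  # request lifetime
        events.append((t, t + duration))
    return events

events = simulate_arrivals(1000)
mean_gap = events[-1][0] / len(events)  # should be close to 25 time units
```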
The above experimental data are summarized in Table 1.
5.2 Results and analysis
5.2.1 Training results and analysis
Because the VNE problem has been proved to be NP-hard, computing an exact optimal solution is generally impractical, so the goal is to find a solution that is as close to optimal as possible. To achieve good results, we train the DRL agent on the training set for 100 epochs. An epoch is one complete pass of all the training data through the network. Through training, we observe how stable the agent becomes with respect to three metrics: long term average revenue, long term revenue consumption ratio and VNR acceptance rate. The results are shown in Fig. 5.
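For readers who want to reproduce the three metrics, the following sketch shows one common way to compute them from a log of processed requests. The revenue and cost formulas used here (revenue as the sum of CPU and bandwidth demands, cost as CPU plus bandwidth multiplied by the number of substrate hops) are the definitions widely used in the VNE literature and are an assumption in this sketch; the record layout and function names are hypothetical.

```python
def vnr_revenue(cpu_demands, bw_demands):
    """Revenue of an accepted VNR: total CPU plus total bandwidth demand."""
    return sum(cpu_demands) + sum(bw_demands)

def vnr_cost(cpu_demands, bw_demands, hops):
    """Cost of an embedding: CPU plus bandwidth weighted by path length."""
    return sum(cpu_demands) + sum(b * h for b, h in zip(bw_demands, hops))

def summarize(records, elapsed_time):
    """Aggregate the three evaluation metrics over a list of VNR records."""
    accepted = [r for r in records if r["accepted"]]
    revenue = sum(r["revenue"] for r in accepted)
    cost = sum(r["cost"] for r in accepted)
    return {
        "avg_revenue": revenue / elapsed_time,
        "rc_ratio": revenue / cost if cost else 0.0,
        "acceptance_rate": len(accepted) / len(records) if records else 0.0,
    }

# One accepted request (two nodes, one link mapped onto a 2-hop path)
# and one rejected request, observed over 100 time units.
records = [
    {"accepted": True,
     "revenue": vnr_revenue([10, 20], [5]),
     "cost": vnr_cost([10, 20], [5], hops=[2])},
    {"accepted": False, "revenue": 0, "cost": 0},
]
metrics = summarize(records, elapsed_time=100)
```

Under these definitions, the revenue consumption ratio equals 1 only when every virtual link is mapped onto a single substrate link, which is why it reflects how efficiently substrate resources are used.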
In the initial stage of training, the parameters of the policy network are randomly initialized, so the agent takes essentially random actions to explore for good solutions, and its stability is poor. In the middle stage of training, as the agent becomes more familiar with the network environment, it finds good solutions more often and receives larger reward signals, and its stability gradually improves. In the later stage of training, the learning ability of the agent is limited by the capacity of the policy network. By this time the agent has accumulated a certain amount of reward and its actions tend to be stable, so the fluctuation range is small. The gradual stabilization of the curves indicates that the agent training is effective, which lays a good foundation for applying the DRL agent to the test set.
Figure 6 shows the change of the cross entropy loss during the training phase. The loss keeps decreasing as training proceeds, which further indicates that the training of the policy network is effective.
Fig. 5 also shows that the long term average revenue and the VNR acceptance rate decrease over time, because these two metrics are limited by the amount of available network resources. As the substrate network resources are consumed, the number of VNRs that the network can carry continues to decrease, so the long term average revenue and the acceptance rate show a downward trend. The revenue consumption ratio, in contrast, does not depend on the amount of remaining substrate network resources, so it does not show a downward trend.
5.2.2 Test results and analysis
After training, the DRL agent is evaluated on the test set to verify the effectiveness of the algorithm. Since the mapping probability of each substrate node has been obtained in the training phase, virtual nodes are mapped directly according to these probabilities in the test phase.
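Probability-driven node mapping can be sketched as a greedy choice among feasible substrate nodes. The feasibility rules below (enough CPU, a security level at least as high as the requirement, and no reuse of a substrate node within one VNR) are assumptions made for this example and may differ from the constraints defined earlier in the paper; the function name and the data are hypothetical.

```python
def select_node(probs, substrate, demand, used):
    """Pick the feasible substrate node with the highest mapping probability.

    Returns None when no substrate node can host the virtual node.
    """
    candidates = [
        v for v, attrs in substrate.items()
        if v not in used
        and attrs["cpu"] >= demand["cpu"]
        and attrs["security"] >= demand["security"]
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: probs[v])

# Toy substrate network with three nodes and made-up mapping probabilities.
substrate = {
    0: {"cpu": 80, "security": 1},
    1: {"cpu": 60, "security": 3},
    2: {"cpu": 30, "security": 3},
}
probs = {0: 0.5, 1: 0.3, 2: 0.2}
demand = {"cpu": 40, "security": 2}

first = select_node(probs, substrate, demand, used=set())
second = select_node(probs, substrate, demand, used={1})
```

In this toy case node 0 has the highest probability but fails the security requirement, so node 1 is chosen; once node 1 is taken, no feasible node remains and the request would be rejected.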
We compare the DRL-VNE algorithm based on differentiated QoS and security requirements (QS-DRL-VNE) with the BASELINE algorithm [53], the BL-VNE algorithm [45] and the CNL-VNE algorithm [54]. BASELINE is a typical VNE algorithm based on an intelligent learning method, BL-VNE is based on the mapping cost of the virtual network, and CNL-VNE is a security-aware VNE algorithm. The core ideas of these algorithms are listed in Table 2. The algorithms are compared in terms of three metrics: long term average revenue, long term revenue consumption ratio and VNR acceptance rate, as shown in Fig. 7.
According to the comparison of the three metrics, QS-DRL-VNE performs better than the other three algorithms, for two main reasons. First, QS-DRL-VNE applies an efficient intelligent learning method to the VNE problem and handles its decision-making and optimization process efficiently, which gives it an advantage over the BL-VNE and CNL-VNE algorithms. Second, QS-DRL-VNE extracts the substrate network features in a reasonable way, so that the DRL agent is trained in a more realistic network environment and achieves better results on the test set than the BASELINE algorithm. These results show that the QS-DRL-VNE algorithm is effective.
6 Conclusion
This paper combines DRL with VNE to address the differentiated QoS and security requirements of network end users/applications. Within the framework of NV, meeting these requirements ultimately reduces to a VNE problem, and the DRL method improves the decision-making and optimization ability of the VNE algorithm.
We map the QoS and security requirements of network end users/applications to four network attributes: CPU, bandwidth, delay and security. The DRL agent is trained in a network environment composed of these four attributes, and the mapping probability of each substrate node is finally obtained. The experimental results show that this method is a feasible way to solve the VNE problem and achieves good results. The algorithm is therefore of practical significance for meeting the differentiated QoS and security requirements of network end users/applications.
References
Zhang P, Yao H, Qiu C, Liu Y (2018) Virtual network embedding using node multiple metrics based on simplified ELECTRE method. IEEE Access 6:37314–37327
Kumar N, Aujla GS, Garg S, Kaur K, Ranjan R, Garg SK (2019) Renewable energy-based multi-indexed job classification and container management scheme for sustainability of cloud data centers. IEEE Trans Ind Inform 15(5):2947–2957
Ning Z, Dong P, Wang X et al (2020) When deep reinforcement learning meets 5G-enabled vehicular networks: a distributed offloading framework for traffic big data. IEEE Trans Ind Inform 16(2):1352–1361
Du J, Jiang C, Zhang H, Ren Y, Guizani M (2018) Auction design and analysis for SDN-based traffic offloading in hybrid satellite-terrestrial networks. IEEE J Sel Areas Commun 36(10):2202–2217
Munoz R et al (2015) Integrated SDN/NFV management and orchestration architecture for dynamic deployment of virtual SDN control instances for virtual tenant networks [invited]. IEEE/OSA J Opt Commun Netw 7(11):B62–B70
Aujla GS, Chaudhary R, Kumar N, Rodrigues JJPC, Vinel A (2017) Data offloading in 5G-enabled software-defined vehicular networks: a Stackelberg-game-based approach. IEEE Commun Mag 55(8):100–108
Batth RS, Nayyar A, Nagpal A (2018) Internet of robotic things: driving intelligent robotics of future—concept, architecture, applications and technologies. In: 2018 4th International Conference on Computing Sciences (ICCS), Jalandhar, pp 151–160
Martins G, Kopp LF, Genta J et al (2019) A Prediction-based multisensor heuristic for the internet of things. In: The 15th ACM International Symposium on QoS and Security for Wireless and Mobile Networks, pp 71–78
Du J, Gelenbe E, Jiang C, Zhang H, Ren Y (2017) Contract design for traffic offloading and resource allocation in software defined ultra-dense networks. IEEE J Sel Areas Commun 35(11):2457–2467
Zhang P, Yao H, Liu Y (2016) Virtual network embedding based on the degree and clustering coefficient information. IEEE Access 4:8572–8580
Sandhu AK, Singh Batth R, Nagpal A (2019) Improved QoS Using Novel Fault Tolerant Shortest Path Algorithm in Virtual Software Defined Network (VSDN). In: 2019 International Conference on Automation, Computational and Technology Management (ICACTM), London, United Kingdom, pp 383–388
Ning Z, Zhang K, Wang X, Obaidat MS et al (2020) Joint computing and caching in 5G-envisioned internet of vehicles: a deep reinforcement learning-based traffic control system. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2020.2970276
Jindal A, Aujla GS, Kumar N, Chaudhary R, Obaidat MS, You I (2018) SeDaTiVe: SDN-enabled deep learning architecture for network traffic control in vehicular cyber-physical systems. IEEE Netw 32(6):66–73
Aljeri N, Boukerche A (2019) An efficient handover trigger scheme for vehicular networks using recurrent neural networks. In: The 15th ACM international symposium on qos and security for wireless and mobile networks, pp 85–91
Zhang Y, Ren SQ, Chen SB, Tan B, Lim ES, Yong KL (2013) DifferCloudStor: differentiated quality of service for cloud storage. IEEE Trans Magn 49(6):2451–2458
Xiong B, Yang K, Zhao J, Li W, Li K (2016) Performance evaluation of OpenFlow-based software-defined networks based on queueing model. Comput Netw 102:172–185
Zhang J, Wei W, Chaoquan L, Jin W, Arun KS (2020) Lightweight deep network for traffic sign classification. Ann Telecommun 75(7):369–379
Zhao J, Zhigang H, Xiong B, Li K (2018) Accelerating packet classification with counting bloom filters for virtual openflow switching. China Commun 15(10):117–128
Aujla GS, Jindal A, Kumar N (2018) EVaaS: electric vehicle-as-a-service for energy trading in SDN-enabled smart transportation system. Comput Netw 143:247–262
Vishnu NS, Singh Batth R, Singh G (2019) Denial of service: types, techniques, defence mechanisms and safe guards. In: 2019 international conference on computational intelligence and knowledge economy (ICCIKE), Dubai, United Arab Emirates, pp 695–700
Wang J, Gao Y, Liu W, Wenbing W, Lim S-J (2019) An asynchronous clustering and mobile data gathering schema based on timer mechanism in wireless sensor networks. Comput Mater Contin 58(3):711–725
Wang J, Gao Y, Yin X, Li F, Kim H-J (2018) An enhanced PEGASIS algorithm with mobile sink support for wireless sensor networks. Wireless Communications and Mobile Computing 2018
Ju C, Yu G, Arun KS, Gwang-jun K (2018) A PSO based energy efficient coverage control algorithm for wireless sensor networks. Comput Mater Contin 56(3):433–446
Abhishek NV, Lim TJ, Tandon A, Sikdar B (2018) Detecting forwarding misbehavior in clustered IoT networks. In: The 14th ACM international symposium on QoS and security for wireless and mobile networks, pp 1–6
Ouferhat N, Mellouk A (2006) QoS dynamic routing for wireless sensor networks. In: The 2nd ACM international workshop on Quality of service & security for wireless and mobile networks, pp 45–50
Ahmad I, Namal S, Ylianttila M, Gurtov A (2015) Security in software defined networks: a survey. IEEE Commun Surv Tutor 17(4):2317–2346
Varadharajan V, Karmakar K, Tupakula U, Hitchens M (2019) A policy-based security architecture for software-defined networks. IEEE Trans Inf Forens Secur 14(4):897–912
Aujla GS, Chaudhary R, Kaur K, Garg S, Kumar N, Ranjan R (2019) SAFE: SDN-assisted framework for edge-cloud interplay in secure healthcare ecosystem. IEEE Trans Ind Inform 15(1):469–480
Ziane S, Mellouk A (2005) A swarm intelligent multi-path routing for multimedia traffic over mobile ad hoc networks. In: First ACM international workshop on quality of service and security in wireless and mobile networks, pp 55–62
Zhang P, Yao H, Liu Y (2018) Virtual network embedding based on computing, network, and storage resource constraints. IEEE Internet Things J 5(5):3298–3304
Aujla GS, Chaudhary R, Kumar N, Kumar R, Rodrigues JJ (2018) An ensembled scheme for QoS-aware traffic flow management in software defined networks. In: 2018 IEEE international conference on communications (ICC), pp 1–7. IEEE
Parra OS, Garica G, Reyes B (2014) Traffic forecasting using a multi layer perceptron model. In: 10th ACM Symposium on QoS and security for wireless and mobile networks, pp 133–136
Kaur K, Garg S, Aujla GS, Kumar N, Rodrigues JJPC, Guizani M (2018) Edge computing in the industrial internet of things environment: software-defined-networks-based edge-cloud interplay. IEEE Commun Mag 56(2):44–51
Ning Z, Dong P, Wang X, Rodrigues J, Xia F (2019) Deep reinforcement learning for vehicular edge computing: an intelligent offloading system. ACM Trans Intell Syst Technol 10(6):60
Jiang C, Zhang H, Ren Y, Han Z, Chen K, Hanzo L (2017) Machine learning paradigms for next-generation wireless networks. IEEE Wirel Commun 24(2):98–105
Wang J, Jiang C, Zhang H, Ren Y, Chen K-C, Hanzo L (2020) Thirty years of machine learning: the road to pareto-optimal wireless networks. In: IEEE Communications Surveys & Tutorials. https://doi.org/10.1109/COMST.2020.2965856, Early Access, pp 1–1
Aujla GS, Chaudhary R, Kumar N, Das AK, Rodrigues JJPC (2018) SecSVA: secure storage, verification, and auditing of big data in the cloud environment. IEEE Commun Mag 56(1):78–85
Zhang P, Yao H, Li M, Liu Y (2017) Virtual network embedding based on modified genetic algorithm. Peer-to-Peer Netw Appl 2:1–12
Cao H, Wu S, Aujla GS, Wang Q, Yang L, Zhu H (2020) Dynamic embedding and quality of service-driven adjustment for cloud networks. IEEE Trans Ind Inform 16(2):1406–1416
Pham M, Hoang DB, Chaczko Z (2020) Congestion-aware and energy-aware virtual network embedding. IEEE/ACM Trans Netw 28(1):210–223
Xu S et al (2019) Load-balancing and QoS based dynamic resource allocation method for smart gird fiber-wireless networks. Chin J Electron 28(6):1234–1243
Li M, Chen C, Hua C, Guan X (2019) Intelligent latency-aware virtual network embedding for industrial wireless networks. IEEE Internet Things J 6(5):7484–7496
Ning Z, Zhang K, Wang X, Guo L et al (2020) Intelligent edge computing in internet of vehicles: a joint computation offloading and caching solution. In: IEEE transactions on intelligent transportation systems. https://doi.org/10.1109/TITS.2020.2997832, pp 1–14
Doriguzzi-Corin R, Scott-Hayward S, Siracusa D, Savi M, Salvadori E (2020) Dynamic and application-aware provisioning of chained virtual security network functions. IEEE Trans Netw Serv Manag 17(1):294–307
Zhang P, Li H, Ni Y, Gong F, Li M, Wang F (2019) Security aware virtual network embedding algorithm using information entropy TOPSIS. J Netw Syst Manag 5:1–23
Gong S, Chen J, Huang C, Zhu Q (2015) Trust-aware secure virtual network embedding algorithm. J Commun 36(11):1–10
Liu X, Wang B, Liu S, Yang Z, Zhao Z (2018) Heuristic algorithm for secure virtual network embedding. Syst Eng Electron 40(3):1–6
Du J, Jiang C, Han Z, Zhang H, Mumtaz S, Ren Y (2019) Contract mechanism and performance analysis for data transaction in mobile social networks. IEEE Trans Netw Sci Eng 6(2):103–115
Zhang P (2018) Incorporating energy and load balance into virtual network embedding process. Comput Commun 129:80–88
Aujla GS, Singh A, Kumar N (2020) AdaptFlow: adaptive flow forwarding scheme for software-defined industrial networks. IEEE Internet Things J 7(7):5843–5851
Aujla GS, Singh A, Singh M, Sharma S, Kumar N, Choo KR (2020) BloCkEd: blockchain-based secure data processing framework in edge envisioned V2X environment. IEEE Trans Veh Technol 69(6):5850–5863
Reverdy P, Leonard NE (2016) Parameter estimation in softmax decision-making models with linear objective functions. IEEE Trans Autom Sci Eng 13(1):54–67
Yu M, Yi Y, Rexford J, Chiang M (2008) Rethinking virtual network embedding: substrate support for path splitting and migration. ACM SIGCOMM Comput Commun Rev 38:17–29
Chowdhury NMMK, Rahman MR, Boutaba R (2009) Virtual network embedding with coordinated node and link mapping. In: Proceedings of the IEEE INFOCOM, Rio de Janeiro, pp 783–791
Acknowledgements
This work is partially supported by the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-006, partially supported by Shandong Provincial Natural Science Foundation under Grant ZR2020MF006, and partially supported by “the Fundamental Research Funds for the Central Universities” of China University of Petroleum (East China) under Grant 20CX05017A.
Cite this article
Wang, C., Batth, R.S., Zhang, P. et al. VNE solution for network differentiated QoS and security requirements: from the perspective of deep reinforcement learning. Computing 103, 1061–1083 (2021). https://doi.org/10.1007/s00607-020-00883-w
DOI: https://doi.org/10.1007/s00607-020-00883-w