Abstract

The functionality of complex networks, such as their connectivity and communication capability, is related to the number and length of paths between node pairs. In this paper, we propose a new path connectivity measure based on the number and length of paths in a network (PCNL) to evaluate network path connectivity. By comparing the PCNL with the classical natural connectivity, we demonstrate the effectiveness of the PCNL in measuring the path connectivity of networks. Because of the importance of the shortest paths, we further propose the shortest path connectivity measure (SPCNL), which is based on the number and length of the shortest paths. We then use edge-betweenness-based malicious attacks to study the relationship between the SPCNL and network topology in five types of networks. The results show that, as the number of deleted edges increases, the SPCNLs of the networks closely track the heterogeneity of their topologies and follow a similar trend. These findings indicate that the SPCNL is positively correlated with the heterogeneity of the network topology, which provides a new perspective for designing complex networks with high path connectivity.

1. Introduction

Complex networks such as power grids, transportation networks, and telecommunication networks carry the flows of current, products, and information that are essential to economic development and social security. The vast amount of information collected by wireless sensor networks has brought great convenience to production and daily life [1, 2]. Therefore, it is important to ensure that such networks continue to function properly for the normal operation of society. However, a growing number of attacks on and failures of complex networks have caused huge losses to people’s production and lives. As some rare occurrences in the past have shown, complex networks are still vulnerable to diverse attacks [3–6]. To prevent these losses, it is necessary to design robust networks that can withstand these malicious attacks.

The node connectivity of a network is an important property that describes the ability of the network to maintain its functionality after nodes or edges are removed by an attack [7, 8]. Albert et al. [9] studied the changes of the maximal connected component (MCC), i.e., the size of the largest connected subgraph in the remaining network, after a small fraction of the nodes are removed from an exponential network and a scale-free network under random attacks and targeted attacks, respectively. They found that scale-free networks remain surprisingly well connected under random attacks but are extremely vulnerable to targeted attacks, while exponential networks do not exhibit this property. Schneider et al. [10] introduced a new connectivity measure and used it to devise a method to reconstruct networks against malicious attacks. Their results showed that networks with an “onion-like” structure have significantly high robustness against malicious targeted attacks. Louzada et al. [11] proposed a new measure based on communication efficiency and outlined a procedure to modify any given network to enhance its connectivity via an optimization approach using simulated annealing. Their results showed that high assortativity and the onion-like structure are the characteristics of networks with high node connectivity. Zeng and Liu [12] proposed a link-robustness index to measure the node connectivity of a network under different malicious attack strategies.

The above robustness measures are mainly considered from the viewpoint of node connectivity. In fact, the number of paths in the network is also a very important connectivity index. Path diversification in networks is an important mechanism that can be used to select multiple paths between a given node pair to achieve maximum flow robustness. Multipath routing is one way of improving the robustness of the transmitted information [13, 14]. Multipath selection, control, and other related algorithms have been widely studied against various malicious attacks [15–17]. Therefore, networks with a large number of paths can provide a solid physical path foundation for these algorithms. Furthermore, the shortest paths, i.e., the paths with the smallest length among all paths between two nodes, are the most efficient for transmitting flows such as electrical current, traffic, and communication packets from a source node to a termination node in complex networks. Therefore, the number of shortest paths is also an important topological index for the functionality of a network. Many classical routing algorithms and robustness measures are designed for networks according to the shortest paths [6, 18–24].

For a source node and a termination node, there may be several paths between them. When one path is broken, the two nodes can still be connected through other alternative paths. Therefore, the greater the number of paths, the higher the path connectivity between the two nodes. With the same number of paths, the shorter the path length, the higher the network efficiency. Therefore, we propose a new measure (PCNL) in this paper to evaluate the network path connectivity by considering the number and length of the paths. The shortest paths between two nodes are of particular importance for a network to provide the fastest and strongest interaction. We therefore also propose the shortest path connectivity (SPCNL), which considers only the number and length of the shortest paths. We study the SPCNL of BA networks and ER networks for four groups with different network sizes, where each group has the same average degree. We find that the BA networks have a higher SPCNL. Furthermore, we use Monte Carlo simulations to analyze the path connectivity of the above networks and three other types of networks, which are generated from the BA networks by an edge rewiring algorithm, against edge-betweenness-based malicious attacks. The results demonstrate that the SPCNL is positively correlated with the heterogeneity of the network topology.

The rest of the paper is arranged as follows. Section 2 summarizes the related work. In Section 3, we show the effect of the number and length of the paths on the path connectivity and propose the new network path connectivity measure. In Section 4, we study the SPCNL of BA networks and ER networks for four groups with different network sizes. In Section 5, we study the SPCNL of BA networks, ER networks, and three other types of networks under edge-betweenness-based malicious attacks. Finally, we give some conclusions in Section 6.

2. Related Work

For any two nodes in a connected network, there may be several paths between them. Therefore, the number of paths has a great impact on the measurement of network connectivity. Oyama and Morohosi [22] proposed a quantitative method for evaluating the stable connectivity of a network-structured system by shortest-path-counting methods. Morohosi [23] proposed a connectivity measure based on the shortest path length distribution and used Monte Carlo methods to compute the measure and find the robustness properties of networks. Kobayashi et al. [24] proposed a quantitative robustness measure of a network. They defined the connectivity function and used the Monte Carlo method to estimate the expected edge deletion and node deletion connectivity functions when an arbitrary number of edges or nodes are deleted from the original network. The above studies set the number of shortest paths between two nodes to one and ignored the fact that there may be multiple shortest paths between two nodes. In reality, the number and length of the shortest paths have a great impact on the measurement of network connectivity (Section 3).

If the source and destination of a path are the same node, the path is called a closed path. The number of closed paths is an important index for complex networks. Wu et al. [25] proposed a connectivity measure that considers the number and length of the closed paths simultaneously. The connectivity measure was defined as follows:

$$S = \sum_{l=0}^{\infty} \frac{n_l}{l!} = \sum_{i=1}^{N} e^{\lambda_i}, \tag{1}$$

where $n_l$ is the number of closed paths with length l and $\lambda_i$ is the ith eigenvalue of the adjacency matrix of the network. The authors scaled equation (1) and denoted it by

$$\bar{\lambda} = \ln\left(\frac{1}{N}\sum_{i=1}^{N} e^{\lambda_i}\right), \tag{2}$$

where N is the number of nodes in a network. They call equation (2) the natural connectivity (Na_C). The authors considered the influence of the number and length of closed paths on the connectivity measure simultaneously. However, the traffic or information in a network is mainly transmitted between two different nodes, not from a node to itself. Therefore, the number of closed paths from a node to itself may not accurately reflect the path connectivity of the networks. Moreover, to obtain this measure in the form of a graph spectrum, the authors scale the contribution of closed walks by the factorial of the closed path length. This factorial weighting can lead to inaccurate measurement of network connectivity. In Section 3, we will give an example to demonstrate this problem.
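The closed-walk form of the natural connectivity can be illustrated numerically. The following is a minimal sketch, assuming the network is given as a dense 0/1 adjacency matrix and that the number of closed walks of length l equals the trace of the lth power of that matrix; the function name and the truncation parameter `max_len` are illustrative choices, not part of the original work.

```python
import math


def natural_connectivity(adj, max_len=60):
    """Approximate ln((1/N) * sum_l n_l / l!), with n_l = trace(A^l).

    adj is a square 0/1 adjacency matrix (list of lists). The infinite
    series is truncated at max_len, which is an assumption of this sketch.
    """
    n = len(adj)
    # power holds A^l; it starts as the identity matrix (l = 0).
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = 0.0
    factorial = 1.0
    for l in range(max_len + 1):
        if l > 0:
            factorial *= l
            power = [[sum(power[i][k] * adj[k][j] for k in range(n))
                      for j in range(n)] for i in range(n)]
        closed_walks = sum(power[i][i] for i in range(n))  # n_l
        total += closed_walks / factorial
    return math.log(total / n)


# Two small examples: a single edge and a triangle.
k2 = [[0, 1], [1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
na_c_k2 = natural_connectivity(k2)
na_c_triangle = natural_connectivity(triangle)
```

Because each term is divided by l!, walks longer than a few steps contribute very little to the sum, which is the weighting effect discussed in the text.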

3. Path Connectivity Measure

Given an undirected simple graph G = (V, E), V is the set of nodes and E is the set of edges. A path is a sequence of vertices $P = v_1, v_2, \ldots, v_k$, where $v_i \in V$, i = 1, 2, …, k, and each pair of consecutive vertices is joined by an edge in E. P is also called a path from $v_1$ to $v_k$. The length of a path is defined as the number of edges it contains. Therefore, the length of path P is k − 1. The distance between two nodes is defined as the length of the shortest path between them. The maximum distance between any two nodes in a network is called the diameter (D) of the network. The average path length (Avg_L) of a network is defined as the average distance over all node pairs. Path connectivity refers to the ability of a network to keep node pairs connected by paths of the same length when some paths are disrupted. An intuitive notion of path connectivity is the redundancy of paths between nodes. The greater the number of paths, the lower the risk of disconnection when paths between nodes are broken by the removal of edges.
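The definitions of distance, diameter, and average path length can be made concrete with a breadth-first search. This is a minimal sketch, assuming a connected, unweighted, undirected graph stored as an adjacency dictionary; the function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return the distance (number of edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_avg_length(adj):
    """Return (D, Avg_L) over all unordered node pairs of a connected graph."""
    nodes = sorted(adj)
    total, pairs, diameter = 0, 0, 0
    for idx, s in enumerate(nodes):
        dist = bfs_distances(adj, s)
        for t in nodes[idx + 1:]:
            total += dist[t]  # raises KeyError if the graph is disconnected
            pairs += 1
            diameter = max(diameter, dist[t])
    return diameter, total / pairs


# Path graph 0-1-2-3: distances are 1, 2, 3, 1, 2, 1.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
D, avg_l = diameter_and_avg_length(path4)
```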

Figure 1 shows three path scenarios between node i and node j, in which all paths have length l. Figure 1(a) shows a single path between node i and node j, whereas Figures 1(b) and 1(c) each show n paths between node i and node j. Letting p be the probability that an edge is removed, one can obtain, for each of the three scenarios in Figure 1, the probability that the paths with length l between node i and node j are disconnected; these probabilities are given in equation (3), where l > 2. When l = 2, Figures 1(b) and 1(c) become the same scenario. From equation (3), one can deduce that the probability q that all n paths with length l between node i and node j are disconnected lies in the range given in equation (4).
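The effect of n and l on the disconnection probability can be checked with a small calculation. The sketch below covers only two cases: a single path, and n edge-disjoint paths. It assumes that edges are removed independently with probability p; both the independence and the edge-disjointness are assumptions of this sketch, and it does not model a scenario in which the paths share edges.

```python
def single_path_disconnect(p, l):
    """Probability that one path of l edges is broken.

    Assumes each edge is removed independently with probability p.
    """
    return 1.0 - (1.0 - p) ** l


def disjoint_paths_disconnect(p, l, n):
    """Probability that all n edge-disjoint paths of length l are broken.

    Edge-disjointness is an assumption: with no shared edges, the paths
    fail independently, so the single-path probability is raised to n.
    """
    return single_path_disconnect(p, l) ** n


q_one = single_path_disconnect(0.1, 2)
q_two = disjoint_paths_disconnect(0.1, 2, 2)
```

Under these assumptions, the probability falls as n grows and rises as l grows, in line with the behaviour of q described in the text.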

Note that q decreases with the increase of n or the decrease of l. Intuitively, the more alternative and disjoint paths there are, the stronger the connectivity and the communication or transmission capability between the two nodes. Therefore, one can consider the number of paths as a measure of the path connectivity of networks. For simplicity, we do not distinguish the two scenarios of Figures 1(b) and 1(c) in this paper. Considering the influence of path length on network path connectivity, we propose a path connectivity measure $S_{ij}$ between a pair of nodes based on the number and length of the paths in a connected network, defined in equation (5), where $L_{ij}$ is the length setting for the paths between node i and node j, $|p_{ij}|$ is the length of the shortest paths, and $n_l$ is the number of paths with length l. $S_{ij}$ represents the path connectivity ability between node i and node j. The greater $S_{ij}$ is, the stronger the robustness of connectivity and the communication capability between node i and node j. Note that different settings of $L_{ij}$ will produce different values of $S_{ij}$. One can set $L_{ij}$ according to the actual situation of the network, for example, a distance restriction on network transmission or the diameter D of the network. A connected network with N nodes has $N(N-1)/2$ node pairs. The mean value of $S_{ij}$ over all node pairs is given in equation (6).

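We call this mean value the path connectivity based on the number and length of paths (PCNL). One can use the PCNL to measure the path connectivity of a network. The PCNL changes monotonically as edges are added or deleted. To prove this, given a network G0, let G1 be the network obtained by adding an edge between node i and node j. Let $S^{0}_{ij}$ and $S^{1}_{ij}$ be the path connectivity between node i and node j in G0 and G1, respectively, and consider the PCNLs of G0 and G1. For the same $L_{ij}$, one can obtain $S^{0}_{ij}$ and $S^{1}_{ij}$ from equation (5), where $|p^{0}_{ij}|$ is the length of the shortest path between node i and node j in G0. For the same l, the number of paths of length l between node i and node j in G1 is no smaller than that in G0, and the new edge adds a path of length 1. Therefore, one can obtain $S^{1}_{ij} > S^{0}_{ij}$. For any other node pair (m, n), every path in G0 is still present in G1, so one can deduce that $S^{1}_{mn} \geq S^{0}_{mn}$.

Thus, the PCNL of G1 is larger than that of G0.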
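The monotonicity argument can be illustrated on a toy graph. The sketch below enumerates simple paths up to a length limit and scores each node pair. The weighting used here, in which a path of length l contributes 1/l, is a hypothetical stand-in chosen only for illustration and should not be read as the definition of the PCNL; any weighting that gives every path a positive contribution behaves the same way under edge addition. All function names are illustrative.

```python
from itertools import combinations


def count_simple_paths_by_length(adj, source, target, max_len):
    """Return {length: number of simple paths} from source to target, length <= max_len."""
    counts = {}

    def dfs(node, visited, length):
        if node == target:
            counts[length] = counts.get(length, 0) + 1
            return
        if length == max_len:
            return
        for nxt in adj[node]:
            if nxt not in visited:
                visited.add(nxt)
                dfs(nxt, visited, length + 1)
                visited.remove(nxt)

    dfs(source, {source}, 0)
    return counts


def pair_score(adj, i, j, max_len):
    # HYPOTHETICAL weighting: a path of length l contributes 1/l.
    counts = count_simple_paths_by_length(adj, i, j, max_len)
    return sum(n_l / l for l, n_l in counts.items())


def network_score(adj, max_len):
    """Mean of the pair scores over all unordered node pairs."""
    pairs = list(combinations(sorted(adj), 2))
    return sum(pair_score(adj, i, j, max_len) for i, j in pairs) / len(pairs)


def add_edge(adj, u, v):
    """Return a copy of adj with the undirected edge (u, v) added."""
    new = {k: set(vs) for k, vs in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


g0 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}  # path 0-1-2-3
g1 = add_edge(g0, 0, 3)                      # closes the path into a 4-cycle
score0 = network_score(g0, 3)
score1 = network_score(g1, 3)
```

Adding the edge keeps every existing path and creates new ones, so the score cannot decrease. Enumerating simple paths is exponential in the worst case, so this sketch is only practical for small graphs or small length limits.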

Figure 2 shows two networks with the same degree distribution. Table 1 shows the characteristic parameters of the two networks, where Tri_num denotes the number of triangles in a network, r denotes the degree correlation coefficient, D denotes the diameter, and Avg_L denotes the average shortest path length. We denote by P_num the sum of the numbers of shortest paths and by Na_C the natural connectivity. We obtain the PCNLs of the two networks by setting $L_{ij} = 6$. For network A, if we remove node 1 or node 2, the network becomes disconnected. If we remove the edge between node 1 and node 2, the paths in network A change dramatically. For example, D of network A increases dramatically from 4 to 7. For network B, the removal of any single node cannot disconnect it. In addition, it is intuitive that the removal of any edge will not change the paths of the network significantly. Therefore, the node connectivity and path connectivity of network B are better than those of network A. However, according to Na_C, we draw the opposite conclusion (Table 1). This shows that Na_C has limitations in evaluating the path connectivity of a network. From equation (1), one can find that a closed walk of length l = 2 corresponds to an edge. Because the degree distributions of the two networks are identical, the contribution of l = 2 to Na_C is the same for the two networks. For l = 2, 3, 4, and 5 in network A, one can obtain S from equation (1) as follows:

One can find that the factorial of the closed path length sharply reduces the contribution of path lengths greater than 2 to Na_C. In other words, the factorial amplifies the effect of path length on Na_C. From equation (1), one can find that a closed walk of length l = 3 represents a triangle. There are two triangles in network A and none in network B. This may be why the Na_C of network A is larger than that of network B. Therefore, one can infer that Na_C is too strongly correlated with the short closed path lengths of the networks. This leads to inaccurate measurement of path connectivity by Na_C. From Table 1, one can see that the PCNL of network B is larger than that of network A. This shows the effectiveness of the PCNL in measuring the path connectivity of networks.

Next, we take the PCNL ($L_{ij} = 6$) as an objective function to optimize network A by the degree-preserving rewiring algorithm [26], which keeps the degree distribution of the network unchanged after rewiring. Figure 3 shows the process of network optimization by degree-preserving rewiring. One can find that network B can be obtained from network A by optimizing the PCNL. However, one cannot obtain this optimization result through Na_C. This shows that the PCNL performs better than Na_C in evaluating path connectivity.
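A common way to rewire a network while keeping every node degree unchanged is the double-edge swap, sketched below. Whether the algorithm of [26] uses exactly this move is an assumption; the function names and the `max_tries` parameter are illustrative. The sketch performs a single swap and does not check that the network stays connected.

```python
import random


def degree_preserving_swap(edges, rng, max_tries=1000):
    """Replace edges (a, b) and (c, d) with (a, d) and (c, b) once, if possible.

    Rejects swaps that would create a self-loop or a duplicate edge.
    Returns the edge list unchanged if no valid swap is found in max_tries.
    """
    edge_set = {frozenset(e) for e in edges}
    edge_list = [tuple(e) for e in edges]
    for _ in range(max_tries):
        (a, b), (c, d) = rng.sample(edge_list, 2)
        if len({a, b, c, d}) < 4:
            continue  # the two edges share a node
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a multi-edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        break
    return sorted(tuple(sorted(e)) for e in edge_set)


def degrees(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


cycle6 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
rewired = degree_preserving_swap(cycle6, random.Random(0))
```

In an optimization loop, one would typically keep a swap only when it improves the chosen objective and otherwise revert it.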

For the PCNL, note that the length setting $L_{ij} = K$ needs to be greater than the network diameter D; otherwise, the path information of node pairs whose distance is greater than K will be neglected. From equation (5), one can find that the complexity of the PCNL increases with $L_{ij}$. Considering the influence of shortest paths on network functionality [6, 18–24] and the complexity of the PCNL, we set $L_{ij} = |p_{ij}|$ to calculate the shortest path connectivity of a network. Then, equations (5) and (6) reduce to their shortest-path forms, where $|p_{ij}|$ is the length of the shortest paths and $n_l$ is the number of paths with length l between node i and node j. $S^{sp}_{ij}$ represents the shortest path connectivity ability between node i and node j. $S_{sp}$ is the mean value of $S^{sp}_{ij}$ over all node pairs. We call $S_{sp}$ the path connectivity based on the number and length of shortest paths (SPCNL). Note that $S_{sp}$ may not change monotonically as edges are added or deleted. The reason is that, when one adds or deletes an edge in a network, the number of shortest paths in the network may decrease or increase. Therefore, adding or deleting an edge may either reduce or improve the SPCNL of a network. Similar behaviour has been reported in other studies [27]. For example, when drivers choose the shortest path independently, opening some new road sections may lead to overall traffic network congestion and capacity decline. Although the complexity of the SPCNL is much lower than that of the PCNL, it contains the shortest path information between all node pairs. According to equations (5) and (6), the contribution of the shortest paths to the PCNL is larger than that of other paths. Therefore, we can use the SPCNL to evaluate the path connectivity of a network.
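The number and length of shortest paths between all node pairs can be obtained with a breadth-first search that also accumulates path counts. The sketch below computes P_num, the sum of the numbers of shortest paths over all unordered node pairs, and shows on a toy example that adding an edge can reduce it. The graph representation and function names are illustrative.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """Return (dist, count): shortest distance and number of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # every shortest path to u extends to a shortest path to v
                count[v] += count[u]
    return dist, count


def p_num(adj):
    """Sum of the numbers of shortest paths over all unordered node pairs."""
    nodes = sorted(adj)
    total = 0
    for idx, s in enumerate(nodes):
        _, count = shortest_path_counts(adj, s)
        total += sum(count.get(t, 0) for t in nodes[idx + 1:])
    return total


# 4-cycle 0-1-2-3-0, and the same cycle with the chord (0, 2) added.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
chord = {0: [1, 3, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 0]}
before, after = p_num(cycle4), p_num(chord)
```

In the 4-cycle, nodes 0 and 2 are joined by two shortest paths of length 2. The chord replaces them with a single shortest path of length 1, so the total count drops even though an edge was added.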

4. Relationship between Path Connectivity and Network Topologies

Some natural questions arise: what is the relation between path connectivity and network topology? How can we obtain a network with high path connectivity under a given average degree or degree distribution? We try to answer these questions in this section. We first generate BA networks (heterogeneous networks) [28] and ER networks (homogeneous networks) [29] with sizes of 1000, 2000, 3000, and 4000. For each network type, we generate ten networks of each size. All networks have the same average degree ⟨k⟩ ≈ 6. Figure 4 shows, for each network size, the average value over the ten BA networks and over the ten ER networks. From Figure 4(a), one can see that the SPCNLs of the BA networks are larger than those of the ER networks. Figures 4(c) and 4(d) show that this is because the P_nums of BA networks are greater than those of ER networks and the Avg_Ls of BA networks are smaller than those of ER networks. From Figure 4(c), one can see that, under the same average degree, the BA networks generate more shortest paths than the ER networks. In Figure 4(b), one can see that the Na_Cs of BA networks are far greater than those of ER networks. One can infer that the reason is that the Avg_Ls of BA networks are smaller than those of ER networks (Figure 4(d)). From Figures 4(b) and 4(d), one can see that the effect of path length on Na_C is amplified. From the results shown in Figure 4, one can answer the questions posed at the beginning of this section: under the same average degree, heterogeneous networks have a larger number of shortest paths and stronger path connectivity than homogeneous networks.
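Networks of the two types with matching average degree can be generated as sketched below. The construction details are assumptions of this sketch: the BA generator starts from a complete graph on m + 1 nodes and attaches each new node to m distinct existing nodes chosen in proportion to degree, and the ER generator draws a fixed number of distinct edges uniformly at random. The function names and seeds are illustrative.

```python
import random


def er_graph(n, m_edges, rng):
    """G(n, M) model: m_edges distinct edges chosen uniformly at random."""
    edges = set()
    while len(edges) < m_edges:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return edges


def ba_graph(n, m, rng):
    """Preferential attachment: each new node links to m distinct existing nodes."""
    # Seed graph (an assumption of this sketch): complete graph on m + 1 nodes.
    edges = {(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)}
    # Each node appears in stubs once per incident edge, so a uniform
    # draw from stubs selects nodes in proportion to their degree.
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs.extend((t, new))
    return edges


n = 1000
ba_edges = ba_graph(n, 3, random.Random(1))
er_edges = er_graph(n, len(ba_edges), random.Random(2))
avg_degree = 2 * len(ba_edges) / n  # close to 6 for m = 3
```

Matching the edge count gives both networks the same average degree, so remaining differences between them come from how the degrees are distributed.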

To further confirm the above conclusion, we need more networks with the same average degree and different network topologies. We apply random edge rewiring to one of the ten BA networks to obtain a new network. Then, taking the new network as the initial network, we use the degree-preserving rewiring algorithm to generate three network sets and denote them as Ran networks (uncorrelated networks), Dis networks (disassortative networks), and Ass networks (assortative networks), respectively. There are ten networks in each network set. The networks in the same network set have the same degree distribution and degree correlation coefficient. Note that all of the networks in the three sets have the same average degree as the BA networks and ER networks. The characteristic parameters of the network sets are shown in Table 2, where ⟨k²⟩ is the mean of the squares of the node degrees. ⟨k²⟩ can represent the heterogeneity of a network topology. For Ran networks, Dis networks, Ass networks, BA networks, and ER networks, each characteristic parameter in Table 2 is the average value over the corresponding ten networks.
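The use of the second moment of the degree sequence as a heterogeneity indicator can be illustrated on two small graphs with the same average degree. This is a minimal sketch with illustrative names; the graphs are toy examples, not the networks of Table 2.

```python
def degree_moments(adj):
    """Return (mean degree, mean squared degree) of an undirected graph."""
    degs = [len(nbrs) for nbrs in adj.values()]
    n = len(degs)
    return sum(degs) / n, sum(k * k for k in degs) / n


# Both graphs have 5 nodes and 4 edges, so the mean degree is identical.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
star5 = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
k_path, k2_path = degree_moments(path5)
k_star, k2_star = degree_moments(star5)
```

The star concentrates its edges on one hub, so its mean squared degree is larger than that of the path even though the mean degree is the same.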

In Table 2, one can see that the ⟨k²⟩ and P_num of the BA networks are both larger than those of the other networks. This shows that heterogeneity significantly increases the number of shortest paths in a network. One can also see that the SPCNL of the BA networks is larger than that of the other networks. This means that the shortest path connectivity and the communication capability of the BA networks are better than those of the other networks. The Ran networks, Dis networks, and Ass networks, which have identical ⟨k²⟩, have almost the same SPCNLs. Note that the SPCNL of the BA networks, which have the largest ⟨k²⟩, is significantly larger than the SPCNL of the ER networks, which have the smallest ⟨k²⟩. The order of SPCNL is consistent with the order of ⟨k²⟩ among the Ran networks, Dis networks, Ass networks, BA networks, and ER networks. One can also see that the order of SPCNL is inconsistent with the order of r among all the networks. These results suggest that the SPCNL may be positively correlated with the heterogeneity of a network topology and independent of the degree correlation coefficient. One can see that the Tri_nums and Na_Cs of Ass networks and BA networks are far larger than those of the other networks, which may suggest that Na_C is positively correlated with the Tri_num of a network. This supports the conclusion drawn in Section 3 that Na_C is too strongly correlated with the short closed path lengths of the networks. Next, in Section 5, we carry out edge-betweenness-based malicious attacks on these networks to further verify these conclusions.

5. Simulations

In practice, edges are more vulnerable than nodes in a network [30]. In particular, edges with high betweenness play an important role in the network path connectivity [31]. The larger the betweenness of an edge, the greater the number of shortest paths between node pairs that pass through the edge. If the edges with high betweenness are attacked and removed from a network, the shortest paths of the network will change dramatically. To verify the above conclusion on the relation between network topology and path connectivity, we use edge-betweenness-based malicious attacks to study the SPCNL and Na_C of the above five types of networks. The process of the edge-betweenness-based malicious attack is as follows: (1) the betweenness of each edge in the network is calculated; (2) the edge with the maximal betweenness is removed from the network. We repeat the process 1000 times to remove 1000 edges one by one from each network. Each data point is the average over the ten networks of each type (BA networks, ER networks, Ran networks, Dis networks, and Ass networks) under the edge-betweenness-based malicious attack.
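The two-step attack described above can be sketched as follows. The edge betweenness is computed here with a Brandes-style accumulation for unweighted, undirected graphs, which is one standard way to do it and is an assumption about the implementation; the tie-breaking rule and function names are also illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness (unnormalized) of an unweighted, undirected graph."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order = []
        queue = deque([s])
        while queue:  # BFS that also counts shortest paths (sigma)
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0.0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # accumulate dependencies back toward s
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((u, w))] += share
                delta[u] += share
    # Each unordered pair was counted once from each endpoint.
    return {e: b / 2.0 for e, b in bet.items()}


def betweenness_attack(adj, num_removals):
    """Repeatedly recompute betweenness and remove the edge with the maximum."""
    graph = {u: set(vs) for u, vs in adj.items()}
    removed = []
    for _ in range(num_removals):
        bet = edge_betweenness(graph)
        if not bet:
            break
        # Ties are broken by node labels (an arbitrary choice of this sketch).
        target = max(bet, key=lambda e: (bet[e], tuple(sorted(e))))
        u, v = tuple(target)
        graph[u].discard(v)
        graph[v].discard(u)
        removed.append(tuple(sorted(target)))
    return removed, graph


# Two triangles joined by the bridge (2, 3).
barbell = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
removed, remaining = betweenness_attack(barbell, 1)
```

In this example the bridge carries every shortest path between the two triangles, so it is the first edge to be removed. Recomputing the betweenness after every removal is what makes the attack adaptive, and also what makes it costly for large networks.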

Figure 5(a) shows Na_C as a function of the number of deleted edges, and Figure 5(b) shows Tri_num as a function of the number of deleted edges. From Figures 5(a) and 5(b), one can see that Na_C closely tracks Tri_num and follows a similar trend. For Ass networks, the Na_C and Tri_num curves both decrease slowly as the number of deleted edges increases. For BA networks, as Tri_num drops rapidly, Na_C also exhibits a rapid decline. From Figures 5(a) and 5(b), one can conclude that Na_C is positively correlated with the Tri_num of a network. This supports the previous conclusion that Na_C is too strongly correlated with the short closed path lengths of the networks, which limits its ability to evaluate path connectivity.

For Ran networks, Dis networks, and Ass networks, Figures 5(d) and 5(e) show that the ⟨k²⟩ and P_num of the three types of networks differ little as the number of deleted edges increases. One can also see that the SPCNLs of the three types of networks differ little. Note that the SPCNL of the Ass networks is slightly smaller than that of the Ran networks and Dis networks. Figure 5(f) suggests a reason: the Avg_L of the Ass networks remains larger than that of the Ran networks and Dis networks as the number of deleted edges increases. For BA networks, ⟨k²⟩ decreases rapidly with the deletion of edges until it is close to that of the other networks. From Figure 5(c), one can see that the SPCNL of BA networks follows the same trend as ⟨k²⟩. For ER networks, ⟨k²⟩ remains smaller than that of the other networks in Figure 5(d), and the same holds for the SPCNL in Figure 5(c). One can see that the SPCNLs of all networks closely track ⟨k²⟩ and follow a similar trend (see Figure 5(d)). In general, we conclude that the SPCNL is positively correlated with the heterogeneity of a network topology.

6. Conclusions

The number and length of the shortest paths are important topological indexes for the functionality of complex networks. The greater the number and the shorter the length of paths in a network, the better the path connectivity of the network. Considering the number and length of paths, a new measure called the PCNL has been proposed in this paper to assess network path connectivity. The effectiveness of the proposed measure has been verified by comparison with the classical natural connectivity Na_C. In view of the importance of the shortest paths, we have further proposed the SPCNL based on the number and length of shortest paths. We have studied the SPCNL for two types of networks, namely, BA networks and ER networks. The results show that the BA networks have a larger number of shortest paths and stronger path connectivity than ER networks with identical average degree. The same conclusion holds for the two types of networks at different sizes. To explore the relationship between network topology and path connectivity, we have generated three types of networks with the same degree distribution but different degree correlations, namely, Ran networks, Dis networks, and Ass networks, and carried out edge-betweenness-based malicious attacks on the above five types of networks. In general, the results show that Na_C is positively correlated with Tri_num and that the SPCNL is positively correlated with the heterogeneity of a network topology, which provides a new perspective for designing complex networks with high path connectivity.

Measures based on finding network paths are computationally expensive. If one uses these measures as an objective function to optimize a network, the computational cost grows with the scale of the network. Under the same degree distribution (that is, with network heterogeneity unchanged), increasing the number of shortest paths while limiting path length can effectively increase the SPCNL. However, it is a challenge to achieve this goal with existing optimization methods. Therefore, finding an appropriate algorithm for optimizing networks with these measures as the objective function is an important direction for future work.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research has been supported by the National Natural Science Foundation of China (Grant nos. 61672298, 61873326, 61802155, and 61802201) and the Philosophy Social Science Research Key Project Fund of Jiangsu University (Grant no. 2018SJZDI142).