Combinatorial approach to spreading processes on networks

Mazzilli, Dario; Radicchi, Filippo

doi:10.1140/epjb/s10051-020-00029-z

Regular Article - Statistical and Nonlinear Physics
Published: 11 January 2021

Volume 94, article number 15, (2021)

The European Physical Journal B

Dario Mazzilli¹ &
Filippo Radicchi¹

Abstract

Stochastic spreading models defined on complex network topologies are used to mimic the diffusion of diseases, information, and opinions in real-world systems. Existing theoretical approaches to the characterization of the models in terms of microscopic configurations rely on some approximation of independence among dynamical variables, thus introducing a systematic bias in the prediction of the ground-truth dynamics. Here, we develop a combinatorial framework based on the approximation that spreading may occur only along the shortest paths connecting pairs of nodes. The approximation overestimates dynamical correlations among node states and leads to biased predictions. Systematic bias is, however, pointing in the opposite direction of existing approximations. We show that the combination of the two biased approaches generates predictions of the ground-truth dynamics that are more accurate than the ones given by the two approximations if used in isolation. We further take advantage of the combinatorial approximation to characterize theoretical properties of some inference problems, and show that the reconstruction of microscopic configurations is very sensitive to both the place where and the time when partial knowledge of the system is acquired.

Data Availability Statement

This manuscript has no associated data or the data will not be deposited. [Authors’ comment: Data of the real-world network considered in this paper can be found in Ref. [37].]

References

[1] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Rev. Mod. Phys. 87, 925 (2015)
[2] C.T. Butts, Science 325, 414 (2009)
[3] M.O. Jackson, Social and Economic Networks (Princeton University Press, Princeton, 2010)
[4] A. Vespignani, Nat. Phys. 8, 32 (2012)
[5] A.L. Lloyd, R.M. May, Science 292, 1316 (2001)
[6] K.T. Eames, M.J. Keeling, Proc. Natl. Acad. Sci. 99, 13330 (2002)
[7] L. Weng, F. Menczer, Y.-Y. Ahn, in Eighth International AAAI Conference on Weblogs and Social Media (2014)
[8] C. Castellano, S. Fortunato, V. Loreto, Rev. Mod. Phys. 81, 591 (2009)
[9] Y. Moreno, M. Nekovee, A.F. Pacheco, Phys. Rev. E 69, 066130 (2004)
[10] L. Dall’Asta, A. Baronchelli, A. Barrat, V. Loreto, Phys. Rev. E 74, 036105 (2006)
[11] G. Brandi, R. Di Clemente, G. Cimini, Phys. A Stat. Mech. Appl. 507, 255 (2018)
[12] I. Dobson, B.A. Carreras, D.E. Newman, J.M. Reynolds-Barredo, IEEE Trans. Power Syst. 31, 4831 (2016)
[13] C.A. Hidalgo, B. Klinger, A.-L. Barabási, R. Hausmann, Science 317, 482 (2007)
[14] T.P. Vogels, K. Rajan, L.F. Abbott, Annu. Rev. Neurosci. 28, 357 (2005)
[15] Y. Moreno, R. Pastor-Satorras, A. Vespignani, Eur. Phys. J. B Condens. Matter Complex Syst. 26, 521 (2002)
[16] J.L. Payne, K.D. Harris, P.S. Dodds, Phys. Rev. E 84, 016110 (2011)
[17] C. Castellano, R. Pastor-Satorras, Phys. Rev. Lett. 105, 218701 (2010)
[18] L. Buzna, K. Peters, D. Helbing, Phys. A Stat. Mech. Appl. 363, 132 (2006)
[19] F. Altarelli, A. Braunstein, L. Dall’Asta, A. Lage-Castellanos, R. Zecchina, Phys. Rev. Lett. 112, 118701 (2014)
[20] A.Y. Lokhov, M. Mézard, H. Ohta, L. Zdeborová, Phys. Rev. E 90, 012801 (2014)
[21] F. Radicchi, C. Castellano, Phys. Rev. Lett. 120, 198301 (2018)
[22] D. Kempe, J. Kleinberg, É. Tardos, in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2003), pp. 137–146
[23] Y. Wang, D. Chakrabarti, C. Wang, C. Faloutsos, in Proceedings of the 22nd International Symposium on Reliable Distributed Systems (IEEE, 2003), pp. 25–34
[24] D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, C. Faloutsos, ACM Transactions on Information and System Security 10, 1 (2008). https://doi.org/10.1145/1284680.1284681. ISSN 1094-9224
[25] B. Karrer, M.E. Newman, Phys. Rev. E 82, 016101 (2010)
[26] A.Y. Lokhov, M. Mézard, L. Zdeborová, Phys. Rev. E 91, 012811 (2015)
[27] E. Cator, P. Van Mieghem, Phys. Rev. E 89, 052802 (2014)
[28] J.P. Gleeson, Phys. Rev. X 3, 021004 (2013)
[29] D. Brockmann, D. Helbing, Science 342, 1337 (2013)
[30] M.E. Newman, Phys. Rev. E 66, 016128 (2002)
[31] R.M. Anderson, B. Anderson, R.M. May, Infectious Diseases of Humans: Dynamics and Control (Oxford University Press, Oxford, 1992)
[32] J.P. Gleeson, Phys. Rev. Lett. 107, 068701 (2011)
[33] K.E. Hamilton, L.P. Pryadko, Phys. Rev. Lett. 113, 208701 (2014)
[34] B. Karrer, M.E. Newman, L. Zdeborová, Phys. Rev. Lett. 113, 208702 (2014)
[35] F. Radicchi, Nat. Phys. 11, 597 (2015)
[36] F. Radicchi, C. Castellano, Nat. Commun. 6, 1 (2015)
[37] V. Colizza, R. Pastor-Satorras, A. Vespignani, Nat. Phys. 3, 276 (2007)
[38] H. Prüfer, Arch. Math. Phys. 27, 742 (1918)
[39] S. Pemmaraju, S. Skiena, Computational Discrete Mathematics: Combinatorics and Graph Theory with Mathematica® (Cambridge University Press, Cambridge, 2003)
[40] D. Shah, T. Zaman, in Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (2010), pp. 203–214
[41] D. Shah, T. Zaman, IEEE Trans. Inf. Theory 57, 5163 (2011)
[42] W. Luo, W.P. Tay, M. Leng, IEEE Trans. Signal Process. 61, 2850 (2013)
[43] K. Zhu, L. Ying, IEEE/ACM Trans. Netw. 24, 408 (2014)

Acknowledgements

DM and FR acknowledge support from the US Army Research Office (W911NF-16-1-0104). FR acknowledges support from the National Science Foundation (CMMI-1552487).

Author information

Authors and Affiliations

Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, 47408, USA
Dario Mazzilli & Filippo Radicchi

Corresponding author

Correspondence to Filippo Radicchi.

Appendices

Appendix A: Magnitude of the error associated with the shortest-path combinatorial approximation

In Figs. 2 and 5, we considered a hypothetical setting where the generic node i is connected to the source node s by two independent paths of length $\ell _{si}$ and $\ell _{si} + d \ell $, with $d \ell \ge 0$. The paths are independent in the sense that they do not share any node except for s and i. This fact allows us to compute the exact probabilities for the ground-truth scenario by simply combining the probabilities of the individual paths. The setting is useful to understand the magnitude of the error that we should expect when using SPCA in a non-tree network, where multiple paths between nodes may exist. For simplicity of notation, but without loss of generality, we will use $\ell = \ell _{si}$ in the following description.

1.1 Susceptible-infected model

For the SI model, the probability that the infection reaches a certain node along a path of length $\ell $ in t time steps or less is given by

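$$\begin{aligned} q_{1} (\ell , t) = \sum _{r=\ell }^{t} \, \left( {\begin{array}{c}r-1\\ \ell -1\end{array}}\right) \, \beta ^{\ell } \, (1-\beta )^{r-\ell }, \end{aligned}$$

where the summand is the probability that the infection reaches the node at exactly step r, so that terms with $r < \ell $ do not contribute.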

The previous expression is simply a combination of Eqs. (2) and (7) of the main text. We omitted the explicit dependence on the source and target nodes to simplify the expression. In the presence of two independent paths, the probability that the infection reaches the target node is given by

$$\begin{aligned} q_{2}(\ell , \ell + d \ell , t) = 1 - [1 - q_{1} (\ell , t)] [1 - q_{1} (\ell + d \ell , t)], \end{aligned}$$

thus equal to the probability that spreading occurs at least on one of the two independent paths. The relative error of Fig. 2 is finally quantified as

$$\begin{aligned} \epsilon (\ell , d \ell , t) = 1 - \frac{q_{1}(\ell , t)}{q_{2}(\ell , \ell + d \ell , t)}. \end{aligned}$$
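
The three SI quantities above can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function names (`p_exact`, `q1`, `q2`, `rel_error`) are hypothetical, and the single-path probability is assumed to take the negative-binomial form, i.e., the infection needs exactly $\ell $ successful transmissions, each occurring with probability $\beta $ per time step. Returning NaN when the target is unreachable on both paths is a convention chosen here, since the ratio is undefined in that case.

```python
from math import comb, nan


def p_exact(ell: int, r: int, beta: float) -> float:
    """Probability that the infection covers a path of length ell in exactly r steps.

    Assumed negative-binomial form: the ell-th successful transmission
    happens at step r.
    """
    if ell == 0:
        return 1.0 if r == 0 else 0.0
    if r < ell:
        return 0.0
    return comb(r - 1, ell - 1) * beta**ell * (1.0 - beta) ** (r - ell)


def q1(ell: int, t: int, beta: float) -> float:
    """Probability that the infection covers a single path of length ell in t steps or less."""
    return sum(p_exact(ell, r, beta) for r in range(ell, t + 1))


def q2(ell: int, d_ell: int, t: int, beta: float) -> float:
    """Probability that the infection arrives along at least one of two independent paths."""
    return 1.0 - (1.0 - q1(ell, t, beta)) * (1.0 - q1(ell + d_ell, t, beta))


def rel_error(ell: int, d_ell: int, t: int, beta: float) -> float:
    """Relative error made by keeping only the shortest path."""
    exact = q2(ell, d_ell, t, beta)
    if exact == 0.0:
        return nan  # target unreachable within t steps: ratio undefined
    return 1.0 - q1(ell, t, beta) / exact


if __name__ == "__main__":
    for d_ell in range(0, 4):
        print(d_ell, rel_error(ell=3, d_ell=d_ell, t=10, beta=0.3))
```

Under these assumptions the error is largest for $d \ell = 0$, where the second path is as short as the first, and shrinks as the alternative path gets longer.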

1.2 Susceptible-infected-recovered model

For the SIR model, the calculation is a bit more cumbersome than for the SI model.

Suppose node s is initially in the infected state, and suppose that two independent paths of length $\ell $ and $\ell + d \ell $ connect node i to node s. The probability $q_{2}(\ell , \ell + d \ell , t)$ that node i becomes infected by time t is given by the probability that the infection spreads along at least one of these paths. We know the analytical form of the probability $q_{1}(\ell , t)$ that the infection spreads along a single path of length $\ell $ in t time steps or less, see main text. However, this expression can be used to combine the contributions of the two paths only if the paths are dynamically independent. This condition is satisfied only once the infection has performed at least one step towards the target along at least one of the paths.

Indicate with v the neighbor of node s along the path of length $\ell $ towards i, and with w the neighbor of node s along the path of length $\ell + d \ell $ towards i. The initial configuration at time $t=0$ is such that $\sigma _s^{(0)} = I$ and $\sigma _j^{(0)} = S$ for all $j \ne s$. At time $t=1$, the states of nodes may change as a result of spreading and recovery events. The only nodes that can change their states are s, v and w. For example, we can go to the configuration $\mathbf {\sigma }^{(1)} = (I, I, S, \ldots )$, i.e., such that $\sigma _s^{(1)} = I$, $\sigma _v^{(1)} = I$ and $\sigma _w^{(1)} = S$, with probability $\text {Prob.}[\mathbf {\sigma }^{(1)} = (\sigma _s^{(1)} = I, \sigma _v^{(1)} = I, \sigma _w^{(1)} = S, S, \ldots , S) ] = \beta (1-\beta ) (1- \gamma )$: node v is infected with probability $\beta $, node w is not infected with probability $1-\beta $, and node s does not recover with probability $1-\gamma $. After this first step, the spreading of the infection will happen independently along the two paths, thus we can write $q_2[\ell , \ell + d \ell , t | \mathbf {\sigma }^{(1)} = (\sigma _s^{(1)} = I, \sigma _v^{(1)} = I, \sigma _w^{(1)} = S, S, \ldots , S) ] = 1 - [1-q_1(\ell -1, t-1)] [1-q_1(\ell + d \ell , t-1)]$. There are in total eight such configurations. They are listed in Table 1. In general, we can write that

$$\begin{aligned} q_2 (\ell , \ell + d \ell , t) = \sum _{\mathbf {\sigma }} \, q_2(\ell , \ell + d \ell , t | \mathbf {\sigma }) \, \text {Prob.}(\mathbf {\sigma }), \end{aligned} \tag{A1}$$

where the sum runs over all eight configurations $\mathbf {\sigma }$ of Table 1. The expressions of the probabilities appearing in Table 1 are then used to solve Eq. (A1) by iteration, starting from the initial condition $q_2(\ell , \ell + d \ell , t =0 ) = 0$.
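
Table 1 is not reproduced in this text, so the sketch below only illustrates where the count of eight comes from. It enumerates the states of s, v and w after the first step and assigns each a probability, under the assumption (ours, generalized from the single worked example $\beta (1-\beta )(1-\gamma )$) that the infection of v, the infection of w, and the recovery of s factorize as independent events. The function name `first_step_configurations` is hypothetical, and the entries may be organized differently from the paper's Table 1. The conditional terms $q_2(\ell , \ell + d \ell , t | \mathbf {\sigma })$ require the SIR form of $q_1$ from the main text and are not computed here.

```python
from itertools import product


def first_step_configurations(beta: float, gamma: float) -> dict:
    """Map each first-step state of (s, v, w) to its probability.

    Assumption: v and w are each infected with probability beta, and s
    recovers with probability gamma, all three events being independent.
    """
    configs = {}
    for s_state, v_state, w_state in product("IR", "IS", "IS"):
        prob = (1.0 - gamma) if s_state == "I" else gamma
        prob *= beta if v_state == "I" else (1.0 - beta)
        prob *= beta if w_state == "I" else (1.0 - beta)
        configs[(s_state, v_state, w_state)] = prob
    return configs


if __name__ == "__main__":
    for states, prob in first_step_configurations(beta=0.3, gamma=0.2).items():
        print(states, round(prob, 4))
```

The eight probabilities sum to one, which is a quick consistency check on any set of weights used in Eq. (A1).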

Appendix B: Joint probability of infection from a single source

1.1 Susceptible-infected model

Here, we illustrate how to compute the joint probability $Q^{(t)}_{s\rightarrow i,j}$ that nodes i and j are infected at time t or earlier given that the source of spreading is node s. The computation still takes advantage of Eqs. (2) and (7), by properly accounting for the position of the source node s relative to the positions of the target nodes i and j (see Fig. 12).

If node j sits between nodes s and i, then the infection can reach node i only by first passing through node j. Thus, we can safely write that $Q^{(t)}_{s \rightarrow i,j} = Q^{(t)}_{s \rightarrow i}$. The same argument leads us to write $Q^{(t)}_{s \rightarrow i,j} = Q^{(t)}_{s \rightarrow j}$ if node i sits between nodes s and j.

A less straightforward computation is required when the source node s is connected to nodes i and j by partially independent paths. Part of the spreading path can be shared by the two trajectories, say up to node k as indicated in Fig. 12. After this node, however, the two paths are dynamically independent of each other and the two contributions are computed separately. Specifically, we can write

$$\begin{aligned} Q^{(t)}_{s\rightarrow i,j} = \sum _{r=0}^{t-\max (\ell _{ki},\ell _{kj})} P^{(r)}_{s \rightarrow k} \, Q^{(t-r)}_{k\rightarrow i} \, Q^{(t-r)}_{k\rightarrow j} \; , \end{aligned} \tag{B1}$$

where $P^{(r)}_{s \rightarrow k}$ is the usual probability that the infection reached node k in exactly r stages of the dynamics. The sum on the r.h.s. of Eq. (B1) runs over all possible values of r compatible with the quantity that we want to estimate.
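
Eq. (B1) can be evaluated directly once the single-path probabilities are available. The sketch below is illustrative only: the function names and the arguments `l_sk`, `l_ki`, `l_kj` (the path lengths from s to k, k to i, and k to j) are hypothetical, and both $P^{(r)}_{s \rightarrow k}$ and $Q^{(t)}_{k \rightarrow i}$ are assumed to take the negative-binomial form used for the SI model in Appendix A.

```python
from math import comb


def p_exact(ell: int, r: int, beta: float) -> float:
    """P^{(r)}: probability that a path of length ell is covered in exactly r steps (assumed form)."""
    if ell == 0:
        return 1.0 if r == 0 else 0.0
    if r < ell:
        return 0.0
    return comb(r - 1, ell - 1) * beta**ell * (1.0 - beta) ** (r - ell)


def q_cum(ell: int, t: int, beta: float) -> float:
    """Q^{(t)}: probability that a path of length ell is covered in t steps or less."""
    return sum(p_exact(ell, r, beta) for r in range(ell, t + 1))


def q_joint(l_sk: int, l_ki: int, l_kj: int, t: int, beta: float) -> float:
    """Joint probability of Eq. (B1) that both i and j are infected by time t.

    The infection reaches the branching node k at exactly step r, then
    spreads independently towards i and j in the remaining t - r steps.
    """
    r_max = t - max(l_ki, l_kj)
    return sum(
        p_exact(l_sk, r, beta) * q_cum(l_ki, t - r, beta) * q_cum(l_kj, t - r, beta)
        for r in range(0, r_max + 1)
    )


if __name__ == "__main__":
    print(q_joint(l_sk=2, l_ki=1, l_kj=3, t=8, beta=0.4))
```

Setting `l_kj = 0` places j at the branching node k, i.e., between s and i. The sum then reduces to `q_cum(l_sk + l_ki, t, beta)`, in agreement with the identity $Q^{(t)}_{s \rightarrow i,j} = Q^{(t)}_{s \rightarrow i}$ stated above for that case.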

1.2 Susceptible-infected-recovered model

In the SIR model, we can compute $Q^{(t)}_{s\rightarrow i,j}$ using the same method as for the SI model, with the only caveat that Eq. (A1) and Table 1 must be taken into account whenever the source sits between i and j, or at the point where the two shortest paths become independent.

About this article

Cite this article

Mazzilli, D., Radicchi, F. Combinatorial approach to spreading processes on networks. Eur. Phys. J. B 94, 15 (2021). https://doi.org/10.1140/epjb/s10051-020-00029-z

Received: 17 August 2020
Accepted: 30 November 2020
Published: 11 January 2021
DOI: https://doi.org/10.1140/epjb/s10051-020-00029-z