Directed hybrid random networks mixing preferential attachment with uniform attachment mechanisms

Annals of the Institute of Statistical Mathematics

Abstract

Motivated by the complexity of network data, we propose a directed hybrid random network that mixes preferential attachment (PA) rules with uniform attachment rules. When a new edge is created, with probability \(p\in (0,1)\), it follows the PA rule. Otherwise, the new edge is added between two uniformly chosen nodes. Such a mixture makes the in- and out-degrees of a fixed node grow at a slower rate than in the pure PA case, thus leading to lighter distributional tails. For estimation and inference, we develop two numerical methods, which are applied to both synthetic and real network data. We see that, with the extra flexibility given by the parameter p, the hybrid random network provides a better fit to real-world scenarios in which lighter tails of the in- and out-degree distributions are observed.
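The mechanism described above can be illustrated with a minimal simulation sketch. It assumes the three edge-creation scenarios of a standard directed PA model (new node to existing node, existing node to existing node, existing node to new node, with probabilities \(\alpha \), \(\beta \) and \(\gamma =1-\alpha -\beta \)) and offset parameters \(\delta _{\mathrm{in}}\), \(\delta _{\mathrm{out}}\). The function name `simulate_hybrid`, the one-node seed graph and the single coin flip per edge are illustrative assumptions, not the authors' implementation.

```python
import random

def simulate_hybrid(n, alpha, beta, delta_in, delta_out, p, seed=0):
    """Grow a directed hybrid network for n steps (gamma = 1 - alpha - beta)."""
    rng = random.Random(seed)
    in_deg, out_deg = [1], [1]   # assumed seed graph: one node with a self-loop
    edges = [(0, 0)]

    def pick(deg, delta, use_pa):
        # PA rule: choose a node with weight (degree + delta); otherwise uniform.
        if use_pa:
            weights = [d + delta for d in deg]
            return rng.choices(range(len(deg)), weights=weights)[0]
        return rng.randrange(len(deg))

    for _ in range(n):
        use_pa = rng.random() < p          # one coin per new edge (assumption)
        u = rng.random()
        if u < alpha:                      # new node -> existing node
            head = pick(in_deg, delta_in, use_pa)
            in_deg.append(0)
            out_deg.append(0)
            tail = len(in_deg) - 1
        elif u < alpha + beta:             # existing node -> existing node
            tail = pick(out_deg, delta_out, use_pa)
            head = pick(in_deg, delta_in, use_pa)
        else:                              # existing node -> new node
            tail = pick(out_deg, delta_out, use_pa)
            in_deg.append(0)
            out_deg.append(0)
            head = len(in_deg) - 1
        out_deg[tail] += 1
        in_deg[head] += 1
        edges.append((tail, head))
    return edges, in_deg, out_deg
```

Comparing the largest in-degree obtained with, say, `p=1.0` and `p=0.5` over several seeds gives a rough feel for the lighter tails mentioned above. The sketch recomputes the weights at every step, so it is only suitable for small n.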

References

  • Alves, C., Ribeiro, R., Sanchis, R. (2019). Preferential attachment random graphs with edge-step functions. Journal of Theoretical Probability, 34(1), 438–476.

  • Atalay, E., Hortaçsu, A., Roberts, J., Syverson, C. (2011). Network structure of production. Proceedings of the National Academy of Sciences of the United States of America, 108(13), 5199–5202.

  • Barabási, A.-L., Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.

  • Chen, M.-H., Shao, Q.-M., Ibrahim, J. G. (2010). Monte Carlo Methods in Bayesian Computation. New York, NY: Springer-Verlag.

  • Cooper, C., Frieze, A. (2003). A general model of web graphs. Random Structures and Algorithms, 22(3), 311–335.

  • Csardi, G., Nepusz, T. (2006). The igraph software package for complex network research. InterJournal Complex Systems, 1695.

  • Deijfen, M., van den Esker, H., van der Hofstad, R., Hooghiemstra, G. (2009). A preferential attachment model with random initial degrees. Arkiv för Matematik, 47(1), 41–72.

  • Deijfen, M., van den Esker, H., van der Hofstad, R., Hooghiemstra, G. (2020). A preferential attachment model with random initial degrees. https://arxiv.org/pdf/0705.4151.pdf

  • de Solla Price, D. J. (1965). Networks of scientific papers. Science, 149(3683), 510–515.

  • Durrett, R. T. (2006). Random Graph Dynamics. Cambridge, U.K.: Cambridge University Press.

  • Durrett, R. T. (2019). Probability: Theory and Examples (5th ed.). Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge, U.K.: Cambridge University Press.

  • Gao, F., van der Vaart, A. (2017). On the asymptotic normality of estimating the affine preferential attachment network models with random initial degrees. Stochastic Processes and their Applications, 127(11), 3754–3775.

  • Gelman, A., Carlin, J. B., Dunson, D. B., Vehtari, A., Rubin, D. B. (2013). Bayesian Data Analysis. Boca Raton, FL, U.S.A.: Chapman and Hall/CRC.

  • Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1), 97–109.

  • Henzinger, M., Lawrence, S. (2004). Extracting knowledge from the World Wide Web. Proceedings of the National Academy of Sciences of the United States of America, 101(supplement 1), 5186–5191.

  • Hunter, D. R., Goodreau, S. M., Handcock, M. S. (2008). Goodness of fit of social network models. Journal of the American Statistical Association, 103(481), 248–258.

  • Lagarias, J. C., Reeds, J. A., Wright, M. H., Wright, P. E. (1998). Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM Journal on Optimization, 9(1), 112–147.

  • Liang, F., Liu, C., Carroll, R. J. (2010). Advanced Markov Chain Monte Carlo Methods: Learning from Past Examples. Hoboken, NJ, U.S.A.: Wiley.

  • Mahmoud, H. M. (2019). Local and global degree profiles of randomly grown self-similar hooking networks under uniform and preferential attachment. Advances in Applied Mathematics, 111, 101930.

  • Medina, J. A., Finke, J., Rocha, C. (2019). Estimating formation mechanisms and degree distributions in mixed attachment networks. Journal of Physics A: Mathematical and Theoretical, 52, 095001.

  • Mengersen, K. L., Tweedie, R. L. (1996). Rates of convergence of the Hastings and Metropolis algorithms. Annals of Statistics, 24(1), 101–121.

  • Merton, R. K. (1968). The Matthew effect in science. Science, 159(3810), 56–63.

  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21, 1087.

  • Nash, J. C. (2014). On best practice optimization methods in R. Journal of Statistical Software, 60(2), 1–14.

  • Nelder, J. A., Mead, R. (1965). A simple method for function minimization. The Computer Journal, 7(4), 308–313.

  • Newman, M. E. J. (2001). Clustering and preferential attachment in growing networks. Physical Review E, 65(1), 025102.

  • Pachon, A., Sacerdote, L., Yang, S. (2018). Scale-free behavior of networks with the copresence of preferential and uniform attachment rules. Physica D: Nonlinear Phenomena, 371, 1–12.

  • Samorodnitsky, G., Resnick, S., Towsley, D., Davis, R., Willis, A., Wan, P. (2016). Nonstandard regular variation of in-degree and out-degree in the preferential attachment model. Journal of Applied Probability, 53(1), 146–161.

  • Shao, Z.-G., Zou, X.-W., Jin, Z.-Z. (2006). Growing networks with mixed attachment mechanisms. Journal of Physics A: Mathematical and General, 39, 9.

  • Smith, B. J. (2007). boa: An R package for MCMC output convergence assessment and posterior inference. Journal of Statistical Software, 21(11), 1–37.

  • van der Hofstad, R. (2017). Random Graphs and Complex Networks. Cambridge, U.K.: Cambridge University Press.

  • Viswanath, B., Mislove, A., Cha, M., Gummadi, K.P. (2009, August). On the evolution of user interaction in Facebook. In J. Crowcroft, & B. Krishnamurthy (Eds.), Proceedings of the 2nd ACM Workshop on Online Social Networks (WOSN’09), New York, NY, U.S.A. (pp. 37–42). Association for Computing Machinery.

  • Wan, P., Wang, T., Davis, R. A., Resnick, S. I. (2017). Fitting the linear preferential attachment model. Electronic Journal of Statistics, 11(2), 3738–3780.

  • Wang, T., Resnick, S. (2018). Multivariate regular variation of discrete mass functions with applications to preferential attachment networks. Methodology and Computing in Applied Probability, 20(3), 1029–1042.

  • Wang, T., Resnick, S. (2020). Degree growth rates and index estimation in a directed preferential attachment model. Stochastic Processes and their Applications, 130(2), 878–906.

  • Wang, T., Resnick, S. I. (2015). Asymptotic normality of in- and out-degree counts in a preferential attachment model. Stochastic Models, 33(2), 229–255.

  • Wang, T., Resnick, S. I. (2020). A directed preferential attachment model with Poisson measurement. https://arxiv.org/pdf/2008.07005.pdf.

  • Wang, T., Resnick, S. I. (2021). Common growth patterns for regional social networks: A point process approach. Journal of Data Science. https://doi.org/10.6339/21-JDS1021.

  • Zhang, P., Mahmoud, H. M. (2020). On nodes of small degrees and degree profile in preferential dynamic attachment circuits. Methodology and Computing in Applied Probability, 22(2), 625–645.

Acknowledgements

We would like to thank two anonymous referees and the handling AE for constructive reports that helped improve the quality of the paper.

Author information

Corresponding author

Correspondence to Tiandong Wang.

Supplementary Information

Supplementary file 1 (PDF, 220 KB) is available as electronic supplementary material.

Appendices

A. Proof of Theorem 2

Analogous to the previous proofs, we present the major steps of the proof for the in-degree. To show the convergence of \(\frac{N^\text {in}_m(n)}{n}\), we take two steps. The first is to prove the concentration of \(\frac{N^\text {in}_m(n)}{n}\) around \(\mathbb {E}\left( N^\text {in}_m(n)\right) /{n}\); the second is to find the asymptotic limit of \(\mathbb {E}\left( N^\text {in}_m(n)\right) /{n}\).

Note that when \(\beta =0\), the number of nodes in graph G(n) is deterministic, so the concentration results in van der Hofstad (2017, Proposition 8.4) are applicable, and we have, for \(C>2\sqrt{2}\),

$$\begin{aligned} {\mathbb {P}}\left( \left| N^{\mathrm{in}}_m(n)-\mathbb {E}(N^{\mathrm{in}}_m(n))\right| \ge C\sqrt{n\log n}\right) = o(1/n). \end{aligned}$$

When \(\beta >0\), the total number of nodes in graph G(n) is random, and a more detailed proof is needed. We claim that for \(\beta >0\), there exists some constant \(C>2\sqrt{2}\) such that

$$\begin{aligned} {\mathbb {P}}\left( \left| N^{\mathrm{in}}_m(n)-\mathbb {E}(N^{\mathrm{in}}_m(n))\right| \ge C\sqrt{n\log n}(1+\log n)\right) = o(1/n). \end{aligned}$$
(12)

The proof of (12) relies on rewriting \(N^{\mathrm{in}}_m(n)-\mathbb {E}(N^{\mathrm{in}}_m(n))\) in terms of a Doob martingale, similar to the argument in the corrected version of Deijfen et al. (2009) (available at https://arxiv.org/pdf/0705.4151.pdf, and cited as Deijfen et al. (2020)). Here, however, the number of nodes created at each step is random, so we need to modify the proof machinery outlined in Deijfen et al. (2020). Recall the notation in Sect. 2 that \(\{J_n:n\ge 1\}\) is a sequence of i.i.d. trinomial random variables on \(\{1,2,3\}\) with cell probabilities \(\alpha \), \(\beta \) and \(\gamma \), respectively. Write \(\{J_k:1\le k\le n\}=:J_{[n]}\), and for \(1\le t\le n\), define

$$\begin{aligned} Z_t := \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_t, J_{[t]}\right] , \end{aligned}$$

and \(Z_0 = \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\right] \). Then

$$\begin{aligned} N^{\mathrm{in}}_m(n)-\mathbb {E}(N^{\mathrm{in}}_m(n)) = Z_n-Z_0, \end{aligned}$$

and \(\{Z_t:0\le t\le n\}\) is a martingale with \(\mathbb {E}(|Z_t|) = \mathbb {E}(N^{\mathrm{in}}_m(n)) \le n\).

Then consider

$$\begin{aligned} Z_t-Z_{t-1}&= \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_t, J_{[t]}\right] -\mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t-1]}\right] \\&= \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_t, J_{[t]}\right] - \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] \\&\quad + \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] -\mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t-1]}\right] \\&\quad =: I+II. \end{aligned}$$

For I, the only change in the conditioning is the extra information contained in G(t), which, compared with that in \(G(t-1)\), specifies how the edge created at the t-th step is constructed. This has the potential to affect the in-degrees of at most 2 nodes, thus leading to \(|I|\le 2\).

For the second term, II, we define \(\bar{J}_t\) to be an independent copy of \(J_t\), which is also independent from \(J_{[n]}\). Write \(\bar{J}_{[n]} :=\{J_1,\ldots , J_{t-1},\bar{J}_t, J_{t+1},\ldots , J_n\}\). Let \(\bar{N}^\mathrm{in}_m(n)\) and \(\bar{D}^\mathrm{in}_v(n)\) be the number of nodes with in-degree m, and the in-degree of node v in the hybrid PA graph, \(\bar{G}(n)=(\bar{V}_n, \bar{E}_n)\), constructed from \(\bar{J}_{[n]}\), respectively. Then we have

$$\begin{aligned} II&= \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t-1]}\right] \\&= \mathbb {E}\left[ \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]}\right] \, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] \\&= \mathbb {E}\left\{ \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, \bar{J}_{[n]}\right] \, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right\} \\&= \mathbb {E}\left\{ \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right\} . \end{aligned}$$

Therefore, it suffices to consider

$$\begin{aligned}&\left| \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| \nonumber \\&\le \left| \sum _{v=1}^{|V_n|}{\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -\sum _{v=1}^{|\bar{V}_n|}{\mathbb {P}}\left[ \bar{D}^\mathrm{in}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| \end{aligned}$$
(13)

where potential differences will occur only if \(J_t\ne \bar{J}_t\).

We start by assuming that \(J_t,\bar{J}_t\in \{1,3\}\), i.e. the total numbers of nodes in the two graphs remain unchanged. Then the quantity in (13) is bounded above by

$$\begin{aligned} \sum _{v=1}^{|V_n|}&\left| {\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -{\mathbb {P}}\left[ \bar{D}^\mathrm{in}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| \\&\le \sum _{v=1}^{|V_n|}\mathbb {E}\left[ \varvec{1}_{\{D^{\mathrm{in}}_v(n)\ne \bar{D}^\mathrm{in}_v(n)\}}\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] . \end{aligned}$$

When we either have \(J_t=1, \bar{J}_t=3\) or \(J_t=3, \bar{J}_t=1\), there are at most 2 nodes whose in-degrees will be different. Therefore, when \(J_t,\bar{J}_t\in \{1,3\}\),

$$\begin{aligned} \left| \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| \le 2. \end{aligned}$$

If \((J_t,\bar{J}_t)\in \{(2,1), (2,3), (1,2), (3,2)\}\), then \(\bigl ||V_s|-|\bar{V}_s|\bigr |=1\) for all \(s\ge t\), and we need to consider nodes created before and after step t separately. In particular, the difference in the total number of nodes will also lead to different attachment probabilities in the two graphs. Without loss of generality, we assume \(J_t=2\) and \(\bar{J}_t\ne 2\). For comparison purposes, we relabel the extra node added at the t-th step as \(t'\), and keep the labeling of the other nodes identical in the two graphs. Then the quantity in (13) is bounded above by

$$\begin{aligned} 1+\sum _{v=1}^{|V_n|}&\left| {\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -{\mathbb {P}}\left[ \bar{D}^\mathrm{in}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| . \end{aligned}$$

Let N be the first time after t that a new node is created, i.e. \(N:=\inf \{k\ge t+1: J_k\ne 2\}\). Note that for \(s\in \{t+1,\ldots ,N\}\), every edge that is added at step s and points to node \(t'\) will lead to a potential difference in the in-degree of nodes in \(V_{t-1}\). Hence, apart from node \(t'\), there are at most \(N-t-1\) nodes in \(V_{t-1}\) having different in-degrees in the two graphs. If no edge added between step \(t+1\) and step N points to node \(t'\), then possible differences in the in-degree of a particular node may still occur due to the change in the attachment probabilities. This is also the case for nodes added at step N and afterward. To deal with different in-degrees due to changes in the attachment probabilities, we apply a treatment similar to that in Deijfen et al. (2020, Eq. (2.17)).

We now rewrite

$$\begin{aligned}&{\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -{\mathbb {P}}\left[ \bar{D}^\mathrm{in}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \\&= {\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m, \bar{D}^\mathrm{in}_v(n)\ne m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] \\&\qquad -{\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)\ne m, \bar{D}^\mathrm{in}_v(n)=m\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] . \end{aligned}$$

Therefore, at least one of the attachments to node \(v\in V_n\) needs to have been made in one of the graphs but not the other. Let s denote the first time at which such an attachment was made differently in the two graphs. Then we have \(D^{\mathrm{in}}_v(s-1)=\bar{D}^\mathrm{in}_v(s-1)\le m\). Hence,

$$\begin{aligned}&{\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m, \bar{D}^\mathrm{in}_v(n)\ne m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] \nonumber \\&\le \sum _{s=t+1}^n\mathbb {E}\left[ p(D^{\mathrm{in}}_v(s-1)+\delta _{\mathrm{in}})\left| \frac{1}{s+\delta _{\mathrm{in}}|V_{s-1}|}-\frac{1}{s+\delta _{\mathrm{in}}|\bar{V}_{s-1}|}\right| \, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \nonumber \\&\quad + \sum _{s=t+1}^n (1-p)\left| \frac{1}{|V_{s-1}|}-\frac{1}{|\bar{V}_{s-1}|}\right| \nonumber \\&= \sum _{s=t+1}^n\mathbb {E}\left[ \frac{p\delta _{\mathrm{in}}(D^{\mathrm{in}}_v(s-1)+\delta _{\mathrm{in}})}{(s+\delta _{\mathrm{in}}|V_{s-1}|)(s+\delta _{\mathrm{in}}|\bar{V}_{s-1}|)} \, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \nonumber \\&\quad + \sum _{s=t+1}^n (1-p)\frac{1}{|V_{s-1}||\bar{V}_{s-1}|}. \end{aligned}$$
(14)

Since \(\sum _{v\in V_{s-1}} (D^{\mathrm{in}}_v(s-1)+\delta _{\mathrm{in}})= s+\delta _{\mathrm{in}}|V_{s-1}|\), (14) implies that

$$\begin{aligned}&\sum _{v}{\mathbb {P}}\left[ D^{\mathrm{in}}_v(n)=m, \bar{D}^\mathrm{in}_v(n)\ne m\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] \\&\le \sum _{s=t+1}^n \sum _{v\in V_{s-1}} \left( \mathbb {E}\left[ \frac{p\delta _{\mathrm{in}}(D^{\mathrm{in}}_v(s-1)+\delta _{\mathrm{in}})}{(s+\delta _{\mathrm{in}}|V_{s-1}|)(s+\delta _{\mathrm{in}}|\bar{V}_{s-1}|)} \, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right. \\&\left. \quad \qquad + (1-p)\frac{1}{|V_{s-1}||\bar{V}_{s-1}|}\right) \\&\le \sum _{s=t+1}^n\left( \frac{p\delta _{\mathrm{in}}}{s}+\frac{1-p}{|V_{s-1}|}\right) . \end{aligned}$$

Thus, combining all scenarios together gives that

$$\begin{aligned} \bigl |II\bigr |&\le \mathbb {E}\left[ \left| \mathbb {E}\left[ N^{\mathrm{in}}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, J_{[n]},\bar{J}_{[n]}\right] -\mathbb {E}\left[ \bar{N}^\mathrm{in}_m(n)\, \Big {|} \,\mathcal {F}_{t-1}, {J}_{[n]},\bar{J}_{[n]}\right] \right| \, \Big {|} \,\mathcal {F}_{t-1}, J_{[t]}\right] \\&\le 2\left( 1+\mathbb {E}\left[ N-t-1 + \sum _{s=t+1}^n\left( \frac{p\delta _{\mathrm{in}}}{s}+\frac{1-p}{|V_{s-1}|}\right) \Bigg | \mathcal {F}_{t-1}, J_{[t]}\right] \right) \\&= 2\left( \frac{1}{\beta }+\sum _{s=t+1}^n\frac{p\delta _{\mathrm{in}}}{s}+ \sum _{s=t+1}^n\mathbb {E}\left[ \frac{1-p}{|V_{s-1}|}\Bigg | \mathcal {F}_{t-1}, J_{[t]}\right] \right) . \end{aligned}$$

Applying the bound in Eq. (1.2) of the supplement gives that there exists some constant \(C'>0\) such that

$$\begin{aligned} \bigl |II\bigr |&\le 2/\beta + C' \sum _{s=t+1}^n s^{-1}\le 2/\beta + C' \log (n/t), \end{aligned}$$

which further implies

$$\begin{aligned} |Z_t-Z_{t-1}|\le 1+2/\beta +C'\log (n/t). \end{aligned}$$
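The logarithmic term comes from a standard integral comparison for the harmonic sum, which we spell out for completeness: for \(1\le t\le n\),

$$\begin{aligned} \sum _{s=t+1}^n \frac{1}{s} \le \sum _{s=t+1}^n \int _{s-1}^{s}\frac{\mathrm {d}x}{x} = \int _{t}^{n}\frac{\mathrm {d}x}{x} = \log (n/t). \end{aligned}$$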

Then by the Azuma–Hoeffding inequality, we have

$$\begin{aligned} {\mathbb {P}}\left( \left| N^{\mathrm{in}}_m(n)-\mathbb {E}(N^{\mathrm{in}}_m(n))\right| \ge b\right)&\le 2 \exp \left\{ -\frac{b^2}{8 \sum _{t=1}^n(1+2/\beta +C'\log (n/t))^2}\right\} \\&\le 2\exp \left\{ -\frac{b^2}{8n (1+2/\beta +\log n)^2}\right\} . \end{aligned}$$

Then the claim in (12) follows by setting \(b=C\sqrt{n\log n}(1+2/\beta +\log n)\), with \(C>2\sqrt{2}\).
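To see where the threshold \(C>2\sqrt{2}\) comes from, substitute this choice of b into the last bound:

$$\begin{aligned} 2\exp \left\{ -\frac{b^2}{8n (1+2/\beta +\log n)^2}\right\} = 2\exp \left\{ -\frac{C^2\log n}{8}\right\} = 2n^{-C^2/8}, \end{aligned}$$

which is \(o(1/n)\) exactly when \(C^2/8>1\), i.e. when \(C>2\sqrt{2}\).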

Then we are left with identifying the asymptotic limit of \(\mathbb {E}(N^{\mathrm{in}}_m(n))/n\). Consider the following approximation of the attachment probability:

$$\begin{aligned} \frac{p \bigl (D^{\mathrm{in}}_{i}(n) + \delta _{\mathrm{in}}\bigr )}{\bigl (1 + \delta _{\mathrm{in}}(1 - \beta ) \bigr )n} + \frac{1 - p}{(1 - \beta )n} = \frac{D^{\mathrm{in}}_{i}(n) + \delta _{\mathrm{in}}+ \frac{1 - p}{p(1 - \beta )}\bigl (1 + \delta _{\mathrm{in}}(1 - \beta )\bigr )}{\bigl (1 + \delta _{\mathrm{in}}(1 - \beta )\bigr ) n/p}. \end{aligned}$$

Recall that

$$\begin{aligned} \tilde{\delta }_\mathrm{in} = \delta _{\mathrm{in}}+ \frac{1 - p}{p(1 - \beta )}\bigl (1 + \delta _{\mathrm{in}}(1 - \beta )\bigr ) = \frac{\delta _{\mathrm{in}}}{p} + \frac{1 - p}{p(1 - \beta )}. \end{aligned}$$
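As a quick sanity check of the two algebraic identities above, the following sketch evaluates both sides with exact rational arithmetic. The function names and the sample parameter values are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction as F

def delta_tilde_long(delta, p, beta):
    # First expression: delta + (1-p)/(p(1-beta)) * (1 + delta(1-beta))
    return delta + (1 - p) / (p * (1 - beta)) * (1 + delta * (1 - beta))

def delta_tilde_short(delta, p, beta):
    # Simplified expression: delta/p + (1-p)/(p(1-beta))
    return delta / p + (1 - p) / (p * (1 - beta))

def hybrid_prob(d, n, delta, p, beta):
    # Approximate hybrid attachment probability: PA part plus uniform part
    pa_part = p * (d + delta) / ((1 + delta * (1 - beta)) * n)
    uniform_part = (1 - p) / ((1 - beta) * n)
    return pa_part + uniform_part

def shifted_pa_prob(d, n, delta, p, beta):
    # Same quantity written as a pure-PA rule with the shifted offset
    return (d + delta_tilde_short(delta, p, beta)) / ((1 + delta * (1 - beta)) * n / p)

# Hypothetical values chosen only for illustration.
delta, p, beta, d, n = F(3, 2), F(7, 10), F(1, 5), 4, 1000
same_offset = delta_tilde_long(delta, p, beta) == delta_tilde_short(delta, p, beta)
same_prob = hybrid_prob(d, n, delta, p, beta) == shifted_pa_prob(d, n, delta, p, beta)
print(same_offset, same_prob)
```

Using `Fraction` rather than floats avoids rounding noise, so the equalities can be tested exactly.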

Applying the Chernoff bound again gives

$$\begin{aligned}&\left| \mathbb {E}\left[ \frac{p\bigl (D^{\mathrm{in}}_i(n) + \delta _{\mathrm{in}}\bigr )}{n + 1 + |V_n|\delta _{\mathrm{in}}} + \frac{1 - p}{|V_n|}\right] - \mathbb {E}\left[ \frac{D^{\mathrm{in}}_{i}(n) + \tilde{\delta }_\mathrm{in}}{\bigl (1 + \delta _{\mathrm{in}}(1 - \beta )\bigr ) n/p}\right] \right| \nonumber \\&\quad {}\le C n^{-3/2}\sqrt{\log {n}}, \end{aligned}$$
(15)

for some constant \(C > 0\). Consider an in-degree sequence \(\left\{ \tilde{D}_i^\mathrm{in}(n)\right\} \) from a directed PA network with parameters \((\alpha , \beta , \gamma , \tilde{\delta }_\mathrm{in}, \tilde{\delta }_\mathrm{out})\), as studied in Samorodnitsky et al. (2016) and Wan et al. (2017). An argument similar to Eq. (1.1) in the supplement gives

$$\begin{aligned} \left| \mathbb {E}\left[ \frac{p\bigl (D^{\mathrm{in}}_i(n) + \delta _{\mathrm{in}}\bigr )}{n + 1 + |V_n|\delta _{\mathrm{in}}} + \frac{1 - p}{|V_n|}\right] - \mathbb {E}\left[ \frac{\tilde{D}_i^\mathrm{in}(n) + \tilde{\delta }_\mathrm{in}}{n + 1 + |V_n|\tilde{\delta }_\mathrm{in}}\right] \right| \\ \le \tilde{C} n^{-3/2}\sqrt{\log {n}}, \end{aligned}$$

for some constant \(\tilde{C} > 0\). Note that

$$\begin{aligned} {\mathbb {P}}\bigl (D^{\mathrm{in}}_i(n) = m\bigr ) = \sum _{j = m - 1}^{m} {\mathbb {P}}\bigl (D^{\mathrm{in}}_i(n) = m \, | \,D^{\mathrm{in}}_i(n - 1) = j\bigr ){\mathbb {P}}\bigl (D^{\mathrm{in}}_i(n - 1) = j\bigr ). \end{aligned}$$

By the Chernoff bounds developed above, we have

$$\begin{aligned} {\mathbb {P}}\bigl (D^{\mathrm{in}}_i(n) = m\bigr ) \le {\mathbb {P}}\bigl (\tilde{D}_i^\mathrm{in}(n) = m\bigr ) + (C + \tilde{C}) \sum _{k = i}^{n} k^{-3/2} \sqrt{\log {k}}.\end{aligned}$$

Noticing that \(\sum _{k = i}^{n} k^{-3/2} \sqrt{\log {k}}\) remains bounded as \(n \rightarrow \infty \), we complete the proof by applying the results derived in Wang and Resnick (2020). \(\square \)
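The summability used in this last step can be checked with the elementary inequality \(\log k\le \sqrt{k}\) for \(k\ge 1\), which gives

$$\begin{aligned} \sum _{k = i}^{\infty } k^{-3/2} \sqrt{\log {k}} \le \sum _{k = i}^{\infty } k^{-3/2}\, k^{1/4} = \sum _{k = i}^{\infty } k^{-5/4} < \infty . \end{aligned}$$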

B. Validation of MLE

From the log-likelihood function in (11), we have the following score functions for \(\delta _{\mathrm{in}}\), \(\delta _{\mathrm{out}}\) and p, respectively:

$$\begin{aligned} \frac{\partial }{\partial \delta _{\mathrm{in}}} \log L(\varvec{\theta }\, | \,\varvec{E})&= \sum _{k = 1}^{n} \frac{|V_{k - 1}|}{\bigl (p D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \delta _{\mathrm{in}}\bigr )|V_{k - 1}| + (1 - p)k} \, \mathbb {I}_{\{J_k = \{1, 2\}\}} \nonumber \\&\qquad {}- \sum _{k = 1}^{n} \frac{|V_{k - 1}|}{k + |V_{k - 1}| \delta _{\mathrm{in}}} \, \mathbb {I}_{\{J_k = \{1, 2\}\}}, \end{aligned}$$
(16)
$$\begin{aligned} \frac{\partial }{\partial \delta _{\mathrm{out}}} \log L(\varvec{\theta }\, | \,\varvec{E})&= \sum _{k = 1}^{n} \frac{|V_{k - 1}|}{\bigl (p D^{\mathrm{out}}_{v_{k, 1}}(k - 1) + \delta _{\mathrm{out}}\bigr )|V_{k - 1}| + (1 - p)k} \, \mathbb {I}_{\{J_k = \{2, 3\}\}} \nonumber \\&\qquad {}- \sum _{k = 1}^{n} \frac{|V_{k - 1}|}{k + |V_{k - 1}| \delta _{\mathrm{out}}} \, \mathbb {I}_{\{J_k = \{2, 3\}\}}, \end{aligned}$$
(17)
$$\begin{aligned} \frac{\partial }{\partial p} \log L(\varvec{\theta }\, | \,\varvec{E})&= \sum _{k = 1}^{n} \frac{\bigl (D^{\mathrm{in}}_{v_{k, 2}}(k - 1)|V_{k - 1}| - k\bigr )\mathbb {I}_{\{J_k = \{1, 2\}\}}}{\bigl (p D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \delta _{\mathrm{in}}\bigr )|V_{k - 1}| + (1 - p)k} \nonumber \\&\qquad {}+ \sum _{k = 1}^{n} \frac{\bigl (D^{\mathrm{out}}_{v_{k, 1}}(k - 1)|V_{k - 1}| - k\bigr )\mathbb {I}_{\{J_k = \{2, 3\}\}}}{\bigl (p D^{\mathrm{out}}_{v_{k, 1}}(k - 1) + \delta _{\mathrm{out}}\bigr )|V_{k - 1}| + (1 - p)k}. \end{aligned}$$
(18)
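The summands in (16)–(18) can be checked numerically against finite differences of a single log-likelihood term. The sketch below assumes that each edge contributes \(\log \bigl [p(D+\delta )/(k+\delta |V_{k-1}|) + (1-p)/|V_{k-1}|\bigr ]\), which is the form implied by the denominators in (16)–(18); the function names and the numerical values are hypothetical.

```python
import math

def log_attach_prob(d, k, v, delta, p):
    """Log of p(d+delta)/(k+delta*v) + (1-p)/v, written over a common denominator."""
    num = (p * d + delta) * v + (1 - p) * k
    return math.log(num) - math.log(k + delta * v) - math.log(v)

def score_delta(d, k, v, delta, p):
    """Per-edge summand of the delta-score, matching the form of (16) and (17)."""
    num = (p * d + delta) * v + (1 - p) * k
    return v / num - v / (k + delta * v)

def score_p(d, k, v, delta, p):
    """Per-edge summand of the p-score, matching the form of (18)."""
    num = (p * d + delta) * v + (1 - p) * k
    return (d * v - k) / num

def central_diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Hypothetical values: degree d, step k, node count v = |V_{k-1}|.
d, k, v, delta, p = 3, 50, 35, 1.2, 0.6
fd_delta = central_diff(lambda x: log_attach_prob(d, k, v, x, p), delta)
fd_p = central_diff(lambda x: log_attach_prob(d, k, v, delta, x), p)
print(abs(fd_delta - score_delta(d, k, v, delta, p)))
print(abs(fd_p - score_p(d, k, v, delta, p)))
```

Both printed differences should be at the level of numerical noise if the analytic summands agree with the assumed per-edge term.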

We then set the score function (16) to 0. Note that due to the randomness of \(|V_{k-1}|\), the methodology given in Wan et al. (2017) is not directly applicable. Instead, we approximate the score function (16) as follows:

$$\begin{aligned} \sum _{k = 1}^{n}&\frac{|V_{k - 1}|}{\bigl (p D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \delta _{\mathrm{in}}\bigr )|V_{k - 1}| + (1 - p)k} \, \mathbb {I}_{\{J_k = \{1, 2\}\}}\\&= \sum _{k = 1}^{n} \frac{\mathbb {I}_{\{J_k = \{1, 2\}\}}}{\bigl (p D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \delta _{\mathrm{in}}\bigr ) + (1 - p)k/|V_{k - 1}|}\\&= \frac{1}{p}\sum _{k = 1}^{n} \frac{\mathbb {I}_{\{J_k = \{1, 2\}\}}}{D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \tilde{\delta }_\text {in}} + R_\text {in}(n), \end{aligned}$$

where

$$\begin{aligned} R_\text {in}(n)&= \sum _{k = 1}^{n} \frac{\mathbb {I}_{\{J_k = \{1, 2\}\}}}{p}\left( \frac{1}{D^{\mathrm{in}}_{v_{k,2}}(k-1)+\delta _{\mathrm{in}}/p+(1-p)k/(p|V_{k-1}|)} \right. \\&\qquad \qquad \qquad {} \left. -\frac{1}{D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \tilde{\delta }_\text {in}}\right) . \end{aligned}$$

Therefore,

$$\begin{aligned} |R_\text {in}(n)|&\le \frac{1}{p}\sum _{k=1}^n \frac{(1-p)/p\left| k/|V_{k-1}|-1/(1-\beta )\right| \mathbb {I}_{\{J_k = \{1, 2\}\}}}{(D^{\mathrm{in}}_{v_{k,2}}(k-1)+\delta _{\mathrm{in}}/p+(1-p)k/|V_{k-1}|)(D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \tilde{\delta }_\text {in})}\\&\le \frac{1-p}{p^2}\sum _{k=1}^n \frac{|k/|V_{k-1}|-1/(1-\beta )|}{(\delta _{\mathrm{in}}/p+(1-p)k/|V_{k-1}|)\tilde{\delta }_\text {in}}. \end{aligned}$$

Since \(n/|V_{n-1}|\, \overset{a.s.}{\longrightarrow } \,1/(1-\beta )\), by the Cesàro convergence of random variables we have \(|R_\text {in}(n)|/n\, \overset{a.s.}{\longrightarrow } \,0\). Setting (16) to zero then yields the approximate score equation

$$\begin{aligned} \frac{1}{n}\sum _{k=1}^n \frac{\mathbb {I}_{\{J_k = \{1, 2\}\}}}{D^{\mathrm{in}}_{v_{k, 2}}(k - 1) + \tilde{\delta }_\text {in}}&= \frac{1}{n}\sum _{k = 1}^{n} \frac{|V_{k - 1}|}{k + |V_{k - 1}| \delta _{\mathrm{in}}} \, \mathbb {I}_{\{J_k = \{1, 2\}\}}. \end{aligned}$$
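The step from the bound on \(R_\text {in}(n)\) to this approximate equation rests on the averaged gap \(\frac{1}{n}\sum _{k\le n}\bigl |k/|V_{k-1}|-1/(1-\beta )\bigr |\) becoming small. A minimal Monte Carlo sketch of that behaviour follows, assuming a new node is added at each step with probability \(1-\beta \) (that is, whenever \(J_k\ne 2\)) and an initial graph with a single node; `cesaro_gap` is a hypothetical helper name.

```python
import random

def cesaro_gap(n, beta, seed=0):
    """Average of |k/|V_{k-1}| - 1/(1-beta)| over k = 1..n for one simulated path."""
    rng = random.Random(seed)
    target = 1.0 / (1.0 - beta)
    nodes = 1                      # |V_0|: assumed single-node initial graph
    total = 0.0
    for k in range(1, n + 1):
        total += abs(k / nodes - target)   # here nodes equals |V_{k-1}|
        if rng.random() >= beta:           # J_k != 2: step k creates a new node
            nodes += 1
    return total / n

for n in (100, 10_000, 100_000):
    print(n, cesaro_gap(n, beta=0.3))
```

The printed averages are expected to shrink as n grows, in line with the Cesàro argument. This is a numerical illustration only, not a proof.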

Applying the method in Wan et al. (2017) further yields the following approximate score equation:

$$\begin{aligned} \sum _{m=0}^\infty \frac{N^\text {in}_{>m}(n)/n}{m+\tilde{\delta }_\text {in}}&= \frac{\gamma }{\tilde{\delta }_\text {in}} + \frac{(\alpha +\beta )(1-\beta )}{1+\delta _{\mathrm{in}}(1-\beta )}, \end{aligned}$$
(19)

where \(N^\text {in}_{>m}(n)\) denotes the number of nodes with in-degree strictly greater than m in \(\mathcal {H}_n\).
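The left-hand side of (19) is straightforward to evaluate from an observed in-degree sequence. A minimal sketch is given below, assuming n is the number of edge-creation steps; the function names and the toy degree list are hypothetical.

```python
def tail_counts(degrees):
    """Return [N_{>0}, N_{>1}, ...]: number of nodes with degree strictly above m."""
    max_deg = max(degrees, default=0)
    hist = [0] * (max_deg + 1)
    for d in degrees:
        hist[d] += 1
    counts = []
    above = len(degrees)
    for m in range(max_deg):
        above -= hist[m]          # drop nodes whose degree is exactly m
        counts.append(above)      # what remains has degree > m
    return counts

def score_lhs(degrees, n_steps, delta_tilde):
    """Empirical left-hand side of (19): sum over m of (N_{>m}/n) / (m + delta_tilde)."""
    return sum(c / n_steps / (m + delta_tilde)
               for m, c in enumerate(tail_counts(degrees)))

# Toy in-degree sequence whose degrees sum to n = 7.
in_degrees = [0, 1, 1, 3, 0, 2]
print(tail_counts(in_degrees))
print(score_lhs(in_degrees, n_steps=7, delta_tilde=1.5))
```

The sum stops at the maximum observed degree because \(N^\text {in}_{>m}(n)=0\) beyond it, so the infinite series in (19) is a finite sum for any observed network.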

Similarly, the score equation with respect to (17) can be approximated by

$$\begin{aligned} \sum _{m=0}^\infty \frac{N^\text {out}_{>m}(n)/n}{m+\tilde{\delta }_\text {out}}&= \frac{\alpha }{\tilde{\delta }_\text {out}} + \frac{(\beta +\gamma )(1-\beta )}{1+\delta _{\mathrm{out}}(1-\beta )}, \end{aligned}$$
(20)

with \(N^\text {out}_{>m}(n)\) being the number of nodes with out-degree strictly greater than m in \(\mathcal {H}_n\). However, with (19) and (20) available, the approximation to the third score equation in (18) leads to the deterministic solution \(p=1\). This indicates that the approach to finding the MLE used in Wan et al. (2017) does not give the desired results here.

About this article

Cite this article

Wang, T., Zhang, P. Directed hybrid random networks mixing preferential attachment with uniform attachment mechanisms. Ann Inst Stat Math 74, 957–986 (2022). https://doi.org/10.1007/s10463-022-00827-5

  • DOI: https://doi.org/10.1007/s10463-022-00827-5
