Identifiability and parameter estimation of the overlapped stochastic co-block model

Zhang, Jingnan; Wang, Junhui

doi:10.1007/s11222-022-10114-1

Published: 28 June 2022

Statistics and Computing, volume 32, article number 57 (2022)

Abstract

Stochastic block model (SBM) has been extensively studied for undirected network data with community structure, yet its extension to directed network, stochastic co-block model (ScBM), has only been proposed recently. The key difference of the ScBM model is to introduce out- and in-communities to capture different sending and receiving patterns among nodes. In this paper, we further extend the ScBM model so that each node may belong to multiple out- or in-communities. Particularly, we formulate the ScBM model as a generative model, where the unknown community assignment is modeled based on the exclusive or overlapped community. We also establish the corresponding identifiability of the generative ScBM model, and estimate its parameters via an efficient variational EM algorithm. The advantage of the generative ScBM model is demonstrated in a variety of simulated networks and a real political blog network.

Acknowledgements

This research is supported in part by HK RGC grants GRF-11303918, GRF-11300919 and GRF-11304520. The authors are grateful to the co-ordinating editor and two anonymous referees for their insightful comments and constructive suggestions, which have improved the manuscript significantly.

Author information

Authors and Affiliations

International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, China
Jingnan Zhang
Department of Statistics, The Chinese University of Hong Kong, Kowloon, Hong Kong
Junhui Wang
School of Data Science, City University of Hong Kong, Kowloon, Hong Kong
Jingnan Zhang & Junhui Wang

Corresponding author

Correspondence to Jingnan Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Proposition 1. Given $P_s({\varvec{A}};{\varvec{\theta }})$ with ${\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},{\varvec{W}})\in {\varvec{\Theta }}^{\text {s}}{\setminus }{\varvec{\Theta }}_0^{\text {s}}$, we first show that ${\varvec{\alpha }}$ can be uniquely determined up to a permutation. For any in-node $j\ne 1$, note that

$$\begin{aligned} P(a_{1j}=1|T_{1k}=1)&=\sum _{l=1}^LP(a_{1j}=1|T_{1k}=1,S_{jl}=1)\\&\quad P(S_{jl}=1)=\sum _{l=1}^Lw_{1kl}\beta _l={\varvec{W}}_{1k\cdot }{\varvec{\beta }}, \end{aligned}$$

which does not depend on j, and ${\varvec{W}}_{1k\cdot }$ is the k-th row of ${\varvec{W}}_1$. Thus, we define ${\varvec{\rho }}=(\rho _1,\ldots ,\rho _K)^T$ with $\rho _k=P(a_{1j}=1|T_{1k}=1)$ for $1\le k\le K$. Furthermore, for any m different in-nodes $j_1,\ldots ,j_m \in \{2,\ldots ,n\}$, we have

$$\begin{aligned} P(a_{1j_1}&=\cdots =a_{1j_m}=1|T_{1k}=1)\\&=\sum _{l_1,\ldots ,l_m=1}^LP(a_{1j_1}=\cdots =a_{1j_m}=1|T_{1k}=1,S_{j_1l_1}=\cdots =S_{j_ml_m}=1)\prod _{i=1}^m\beta _{l_i}\\&=\sum _{l_1,\ldots ,l_m=1}^L\prod _{i=1}^mw_{1kl_i}\beta _{l_i}=\prod _{i=1}^m\sum _{l_i=1}^Lw_{1kl_i}\beta _{l_i}=\rho _k^m. \end{aligned}$$

As $n\ge 2K$, define $q_1=1$ and $q_{m+1}=P(a_{12}=1,\ldots ,a_{1,m+1}=1)$ for $1\le m\le 2K-1$. It then holds that

$$\begin{aligned} q_{m+1}&=\sum _{k=1}^KP(a_{12}=1,\ldots ,a_{1,m+1}=1|T_{1k}=1)P(T_{1k}=1)\\&=\sum _{k=1}^K\alpha _k\rho _k^m. \end{aligned}$$

Note that $\{q_m\}_{m=1}^{2K}$ can be fully determined by $P_s({\varvec{A}};{\varvec{\theta }})$, whereas $\alpha _k$ and $\rho _k$ are not due to their dependence on the unknown ${\varvec{T}}$.

Let ${\varvec{B}}=(b_{ij})_{i,j}$ be a $(K+1)\times K$ matrix with $b_{ij}=q_{i+j-1}$, and let ${\varvec{B}}_{-i}$ denote the square matrix obtained by deleting the i-th row of ${\varvec{B}}$. Then we have ${\varvec{B}}_{-(K+1)}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\tilde{{\varvec{\rho }}}}^T$ as $b_{ij}=\sum _{k=1}^K\rho _k^{i-1}\alpha _k\rho _k^{j-1}$, where $\tilde{{\varvec{\rho }}}=({\tilde{\rho }}_{ik})_{i,k=1}^K$ with ${\tilde{\rho }}_{ik}=\rho _k^{i-1}$ is an invertible Vandermonde matrix, and ${\varvec{A}}_0=\text {diag}({\varvec{\alpha }})$. Let $D_i=\text {det}({\varvec{B}}_{-i})$ and $f(x)=\sum _{i=1}^{K+1}(-1)^{i+K+1}D_ix^{i-1}$. Note that the j-th column of ${\varvec{B}}$ can be rewritten as ${\varvec{B}}_{\cdot j}=\sum _{k=1}^K \alpha _k \rho _k^{j-1} {\varvec{x}}_k$ with ${\varvec{x}}_k=(1,\rho _k,\ldots ,\rho _k^K)^T$, and $f(\rho _k)$ is the determinant of the square matrix $({\varvec{B}}, {\varvec{x}}_k)$. Since all the columns of $({\varvec{B}}, {\varvec{x}}_k)$ are linear combinations of $\{{\varvec{x}}_k\}_{k=1}^K$, we have $f(\rho _k)=0$ for any k. As $D_{K+1}\ne 0$, it follows immediately that

$$\begin{aligned} f(x)=D_{K+1}\prod _{i=1}^K(x-\rho _i). \end{aligned}\tag{16}$$

Since $\{q_m\}_{m=1}^{2K}$ is fully determined given $P_s({\varvec{A}};{\varvec{\theta }})$, so are ${\varvec{B}}$ and $f(x)$. Therefore, it follows from (16) that ${\varvec{\rho }}$ can be fully determined up to a permutation of $\{1,\ldots ,K\}$. Solving for ${\varvec{A}}_0$ then gives ${\varvec{A}}_0=\tilde{{\varvec{\rho }}}^{-1}{\varvec{B}}_{-(K+1)}(\tilde{{\varvec{\rho }}}^T)^{-1}$, whose diagonal elements $\alpha _k$ can also be determined up to a permutation of $\{1,\ldots ,K\}$.
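
The moment argument above is constructive, and a small numerical sketch may make it concrete. The Python snippet below is an illustration only and not part of the original proof. It assumes NumPy, uses hypothetical values for ${\varvec{\alpha }}$ and ${\varvec{\rho }}$, forms $q_m=\sum _k\alpha _k\rho _k^{m-1}$, and then recovers both vectors from $\{q_m\}_{m=1}^{2K}$ alone by finding the roots of $f(x)$ and inverting the Vandermonde matrix. The function name `recover_alpha_rho` is introduced here for the example.

```python
import numpy as np

def recover_alpha_rho(q, K):
    """Recover (alpha, rho) from q[0..2K-1], where q[m-1] = q_m = sum_k alpha_k * rho_k**(m-1)."""
    q = np.asarray(q, dtype=float)
    # B is (K+1) x K with b_ij = q_{i+j-1}; with 0-based i, j this is q[i + j].
    B = np.array([[q[i + j] for j in range(K)] for i in range(K + 1)])
    # f(x) = sum_i (-1)^(i+K+1) * D_i * x^(i-1), with D_i = det(B without row i).
    coeffs = [(-1) ** (i + K + 1) * np.linalg.det(np.delete(B, i - 1, axis=0))
              for i in range(1, K + 2)]
    # np.roots expects the highest-degree coefficient first; the roots of f are the rho_k.
    rho = np.sort(np.roots(coeffs[::-1]).real)
    # Vandermonde matrix with entry rho_k^(i-1) in row i, column k.
    V = np.vander(rho, N=K, increasing=True).T
    V_inv = np.linalg.inv(V)
    # A_0 = V^{-1} B_{-(K+1)} (V^T)^{-1} is diagonal with the alpha_k on its diagonal.
    A0 = V_inv @ B[:K, :] @ V_inv.T
    return np.diag(A0), rho

alpha_true = np.array([0.2, 0.3, 0.5])   # hypothetical community proportions
rho_true = np.array([0.1, 0.4, 0.7])     # hypothetical, distinct rho_k
q = [float(np.sum(alpha_true * rho_true ** m)) for m in range(2 * len(alpha_true))]
alpha_hat, rho_hat = recover_alpha_rho(q, K=3)
print(rho_hat, alpha_hat)
```

Sorting the roots fixes one representative of the permutation class; any other ordering of the $\rho _k$ would permute the recovered $\alpha _k$ in the same way.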

For ${\varvec{\beta }}$, note that for any out-node $i\ne 2$,

$$\begin{aligned} P(a_{i2}=1|S_{2l}=1)&=\sum _{k=1}^KP(a_{i2}=1|T_{ik}=1,S_{2l}=1)\\&\quad P(T_{ik}=1)=\sum _{k=1}^Kw_{1kl}\alpha _k ={\varvec{W}}_{1l\cdot }^T{\varvec{\alpha }}, \end{aligned}$$

which does not depend on i, and ${\varvec{W}}_{1l\cdot }^T$ is the l-th row of ${\varvec{W}}_1^T$. Thus, we can define ${\varvec{\rho }}'=(\rho _1',\ldots ,\rho _L')^T$ with $\rho _l'=P(a_{i2}=1|S_{2l}=1)$ for $1\le l\le L$. Let $\tilde{{\varvec{\rho }}}'=({\tilde{\rho }}'_{jl})_{j,l=1}^L$ with ${\tilde{\rho }}'_{jl}=(\rho _l')^{j-1}$ and ${\varvec{B}}_0=\text {diag}({\varvec{\beta }})$. Then as $n\ge 2L$, ${\varvec{\rho }}'$ and ${\varvec{\beta }}$ can also be determined up to a permutation of $\{1,\ldots ,L\}$ following a similar treatment as for ${\varvec{\alpha }}$.

For ${\varvec{W}}$, let ${\varvec{H}}=(h_{ij})_{i,j}$ with $h_{ij}=P(a_{12}=\cdots =a_{1,i+1}=1, a_{32}=\cdots =a_{j+1,2}=1)$ for $1\le i\le K$ and $1\le j\le L$. Note that $h_{i1}=P(a_{12}=\cdots =a_{1,i+1}=1)$. Then we have

$$\begin{aligned} h_{ij}&=\sum _{k,l}P(a_{12}=\cdots =a_{1,i+1}=1, a_{32}=\cdots =a_{j+1,2}=1|T_{1k}=1, S_{2l}=1)P(T_{1k}=1, S_{2l}=1)\\&=\sum _{k,l}P(a_{13}=\cdots =a_{1, i+1}=1|T_{1k}=1)P(a_{32}=\cdots =a_{j+1, 2}=1|S_{2l}=1)w_{1kl}\alpha _k\beta _l\\&=\sum _{k,l}\rho _k^{i-1}\alpha _kw_{1kl}\beta _l(\rho _l')^{j-1}, \end{aligned}$$

and thus ${\varvec{H}}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\varvec{W}}_1{\varvec{B}}_0(\tilde{{\varvec{\rho }}}')^T$. As ${\varvec{H}}$ can be fully determined by $P_s({\varvec{A}};{\varvec{\theta }})$, it immediately follows that ${\varvec{W}}_1={\varvec{A}}_0^{-1}\tilde{{\varvec{\rho }}}^{-1}{\varvec{H}}\big ((\tilde{{\varvec{\rho }}}')^T\big )^{-1}{\varvec{B}}_0^{-1}$ can be fully determined up to permutations for its rows and columns. As $w_{kl}=\text {logit}(w_{1kl})$, this completes the proof of Proposition 1.
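
As a sanity check on the factorisation of ${\varvec{H}}$, the following Python sketch (assuming NumPy, with hypothetical parameter values chosen so that the $\rho _k$ and the $\rho _l'$ are distinct) builds ${\varvec{H}}$ from the closed form and then inverts it to get ${\varvec{W}}_1$ back. It is an illustration under these assumptions; in the proof, ${\varvec{\rho }}$, ${\varvec{\rho }}'$, ${\varvec{\alpha }}$ and ${\varvec{\beta }}$ would first be recovered from the distribution of ${\varvec{A}}$ rather than supplied directly.

```python
import numpy as np

alpha = np.array([0.3, 0.7])              # hypothetical out-community proportions
beta = np.array([0.4, 0.6])               # hypothetical in-community proportions
W1 = np.array([[0.9, 0.5],
               [0.1, 0.2]])               # hypothetical edge probabilities w_{1kl}

rho = W1 @ beta                           # rho_k = W_{1k.} beta
rho_prime = W1.T @ alpha                  # rho'_l = sum_k w_{1kl} alpha_k
V = np.vander(rho, increasing=True).T               # entries rho_k^(i-1)
V_prime = np.vander(rho_prime, increasing=True).T   # entries (rho'_l)^(j-1)
A0, B0 = np.diag(alpha), np.diag(beta)

# H as given by the closed form in the proof.
H = V @ A0 @ W1 @ B0 @ V_prime.T

# Invert the factorisation to get W_1 back, then map to the logit scale.
inv = np.linalg.inv
W1_hat = inv(A0) @ inv(V) @ H @ inv(V_prime.T) @ inv(B0)
W_hat = np.log(W1_hat / (1.0 - W1_hat))   # w_kl = logit(w_{1kl})
print(W1_hat)
```

Relabelling the out- or in-communities permutes the rows or columns of the recovered matrix, which is the permutation ambiguity stated in Proposition 1.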

Proof of Theorem 1. Suppose there exist ${\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})\ne {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^{\text {g}}{\setminus }{\varvec{\Theta }}_0^{\text {g}} $ such that $P_g({\varvec{A}};{\varvec{\theta }})=P_g({\varvec{A}};{\varvec{\theta }}')$. It then suffices to show that ${\varvec{\theta }}'$ and ${\varvec{\theta }}$ are identical up to a permutation over community labels. Note that in (2),

$$\begin{aligned} \gamma _{ij}&=\text {logit}(p_{ij})={\varvec{T}}_i^T{\varvec{W}}{\varvec{S}}_j+{\varvec{T}}_i^T{\varvec{U}}+{\varvec{V}}^T{\varvec{S}} _j+w_0\\&={\varvec{T}}_i^T\big ({\varvec{W}}+{\varvec{U}}{\varvec{1}}_L^T+{\varvec{1}}_K{\varvec{V}}^T+w_0{\varvec{1}}_K{\varvec{1}}_L^T\big ){\varvec{S}}_j, \end{aligned}$$

where the last equality follows from the exclusive constraints that ${\varvec{T}}_i^T{\varvec{1}}_K={\varvec{S}}_j^T{\varvec{1}}_L=1$. Thus, Proposition 1 immediately implies that there exist permutations $\sigma _1$ and $\sigma _2$ on $\{1,\ldots ,K\}$ and $\{1,\ldots ,L\}$, such that $\alpha _k'=\alpha _{\sigma _1(k)}$, $\beta _l'=\beta _{\sigma _2(l)}$ and $w_{2kl}'=w_{2\sigma _1(k)\sigma _2(l)}$ for each $1\le k\le K$ and $1\le l\le L$, where $w_{2kl}=\text {logistic}(w_{kl}+u_k+v_l+w_0)$.

It then remains to show ${\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}$ for any k and l. In fact, it follows from the definition of ${\varvec{W}}_2$ that

$$\begin{aligned} w_{kl}'+u_k'+v_l'+w_0'=w_{\sigma _1(k)\sigma _2(l)}+u_{\sigma _1(k)}+v_{\sigma _2(l)}+w_0. \end{aligned}\tag{17}$$

Summing both sides of (17) over k and l implies that $w_0'=w_0$. With $w_0'$ and $w_0$ canceled, summing over l or k, respectively, implies that $u_k'=u_{\sigma _1(k)}$ or $v_l'=v_{\sigma _2(l)}$, which also leads to $w_{kl}'=w_{\sigma _1(k)\sigma _2(l)}$. Thus, ${\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}$ by setting $\sigma _1(K+1)=K+1$ and $\sigma _2(L+1)=L+1$. The desired result then follows immediately.
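
The summation step can be illustrated numerically. The sketch below assumes sum-to-zero constraints of the form $\sum _ku_k=\sum _lv_l=0$ and $\sum _kw_{kl}=\sum _lw_{kl}=0$; this is our reading of why the sums in (17) isolate each term, and the constraints are not restated in this appendix. Under that assumption, the combined matrix with entries $w_{kl}+u_k+v_l+w_0$ splits uniquely into its four parts. The function name `decompose` and all numerical values are hypothetical.

```python
import numpy as np

def decompose(M):
    """Split M[k, l] = w_kl + u_k + v_l + w0 under sum-to-zero constraints (an assumption)."""
    w0 = M.mean()                       # averaging over k and l leaves only w0
    u = M.mean(axis=1) - w0             # averaging over l leaves u_k + w0
    v = M.mean(axis=0) - w0             # averaging over k leaves v_l + w0
    W = M - u[:, None] - v[None, :] - w0
    return W, u, v, w0

# Hypothetical parameters satisfying the assumed constraints.
u = np.array([0.5, -0.5])
v = np.array([1.0, 0.0, -1.0])
W = np.array([[0.3, -0.1, -0.2],
              [-0.3, 0.1, 0.2]])        # every row and every column sums to zero
w0 = -1.2
M = W + u[:, None] + v[None, :] + w0
W_hat, u_hat, v_hat, w0_hat = decompose(M)
print(w0_hat, u_hat, v_hat)
```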

Proof of Lemma 1. We first show that $\psi $ is an injective mapping. Suppose there exist ${\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}}), {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^\text {o}$, such that $\psi ({\varvec{\theta }})=({\varvec{\eta }},{\varvec{\zeta }},{\varvec{\Pi }})=({\varvec{\eta }}',{\varvec{\zeta }}',{\varvec{\Pi }}')=\psi ({\varvec{\theta }}')$. By the definition of $\psi $, we have,

$$\begin{aligned}&\prod _{k=1}^K \alpha _k^{b_k}(1-\alpha _k)^{(1-b_k)}=\prod _{k=1}^K (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)},\\&\prod _{l=1}^L\beta _l^{c_l}(1-\beta _l)^{(1-c_l)}=\prod _{l=1}^L (\beta _l')^{c_l}(1-\beta _l')^{(1-c_l)},\\&({\varvec{b}}^T,1)\widetilde{{\varvec{W}}}({\varvec{c}}^T,1)^T=({\varvec{b}}^T,1) \widetilde{{\varvec{W}}}'({\varvec{c}}^T,1)^T \end{aligned}$$

for any ${\varvec{b}}\in \{0,1\}^K$ and ${\varvec{c}}\in \{0,1\}^L$. Particularly, setting ${\varvec{b}}={\varvec{1}}_K$ leads to $\prod _{k=1}^K \alpha _k=\prod _{k=1}^K\alpha _k'$, and setting ${\varvec{b}}=({\varvec{1}}_{K-1}^T,0)^T$ leads to $(1-\alpha _K)\prod _{k=1}^{K-1} \alpha _k=(1-\alpha _K')\prod _{k=1}^{K-1}\alpha _k'$. Thus, $\alpha _K=\alpha _K'$, and then $\prod _{k=1}^{K-1} \alpha _k^{b_k}(1-\alpha _k)^{(1-b_k)}=\prod _{k=1}^{K-1} (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)}$. Repeating the same treatment for $K-1,\ldots ,1$ yields that ${\varvec{\alpha }}={\varvec{\alpha }}'$ and, similarly, ${\varvec{\beta }}={\varvec{\beta }}'$. For $\widetilde{{\varvec{W}}}$, setting ${\varvec{b}}={\varvec{c}}={\varvec{0}}$ leads to $w_0=w_0'$, and setting ${\varvec{c}}={\varvec{0}}$ and ${\varvec{b}}$ with only $b_k=1$ leads to $u_k=u_k'$. Similarly, setting ${\varvec{b}}={\varvec{0}}$ and ${\varvec{c}}$ with only $c_l=1$ leads to $v_l=v_l'$, and setting ${\varvec{b}}$ and ${\varvec{c}}$ with only $b_k=c_l=1$ leads to $w_{kl}'=w_{kl}$. Therefore, $\widetilde{{\varvec{W}}}=\widetilde{{\varvec{W}}}'$ and ${\varvec{\theta }}={\varvec{\theta }}'$, and thus $\psi $ is an injective mapping.

For any ${\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}$, we have ${\varvec{\omega }}=\psi ({\varvec{\theta }})\in {\varvec{\Omega }}$, and then

$$\begin{aligned} P_s({\varvec{A}};{\varvec{\omega }})&=\sum _{\widetilde{{\varvec{T}}},\widetilde{{\varvec{S}}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (\gamma _{ij}a_{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _{i,s}\eta _{s}^{\widetilde{T}_{is}}\Big )\Big (\prod _{j,t}\zeta _{t}^{\widetilde{S}_{jt}}\Big )\\&=\sum _{{\varvec{T}},{\varvec{S}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (\gamma _{ij}a_{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _i\eta _{s_i}\Big )\Big (\prod _j\zeta _{t_j}\Big )\\&=\sum _{{\varvec{T}},{\varvec{S}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (a_{ij}\gamma _{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _{i,k}\alpha _{k}^{T_{ik}}(1-\alpha _k)^{1-T_{ik}}\Big )\\&\quad \Big (\prod _{j,l}\beta _{l}^{S_{jl}}(1-\beta _l)^{1-S_{jl}}\Big )=P_o({\varvec{A}};{\varvec{\theta }}). \end{aligned}$$

This completes the proof of Lemma 1.
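
To make the map from the overlapped to the exclusive parametrisation concrete, the sketch below computes the $2^K$ probabilities $\eta _s=\prod _k\alpha _k^{b_k}(1-\alpha _k)^{1-b_k}$ using the indexing $s=\sum _kb_k2^{k-1}+1$ that appears in Lemma 2 below, and then inverts the map. It uses only the Python standard library, and the function names and the values of ${\varvec{\alpha }}$ are hypothetical. The inversion is a direct variant of the injectivity argument: it compares ${\varvec{b}}={\varvec{1}}_K$ with the vector that differs from it in a single position, and it assumes every $\alpha _k>0$.

```python
from itertools import product
from math import prod, isclose

def eta_from_alpha(alpha):
    """eta[s-1] = prod_k alpha_k^b_k (1-alpha_k)^(1-b_k), with s = sum_k b_k 2^(k-1) + 1."""
    K = len(alpha)
    eta = [0.0] * (2 ** K)
    for b in product((0, 1), repeat=K):
        s0 = sum(b_k << k for k, b_k in enumerate(b))   # s - 1, using 0-based k
        eta[s0] = prod(a if b_k else 1.0 - a for a, b_k in zip(alpha, b))
    return eta

def alpha_from_eta(eta, K):
    """Invert the map by comparing b = 1_K with the vector that is 0 only in position k."""
    full = 2 ** K - 1                                   # index of b = 1_K
    return [eta[full] / (eta[full] + eta[full - (1 << k)]) for k in range(K)]

alpha = [0.2, 0.35, 0.6]                                # hypothetical membership probabilities
eta = eta_from_alpha(alpha)
alpha_back = alpha_from_eta(eta, K=3)
print(sum(eta), alpha_back)
```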

Lemma 2

For any ${\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}$, let ${\varvec{\theta }}'=g_{{\varvec{yz}}}({\varvec{\theta }})$ defined as in Section 2.2, then there exists $\sigma =(\sigma _1,\sigma _2)$ with permutations $\sigma _1$ and $\sigma _2$ on $\{1,\ldots ,2^K\}$ and $\{1,\ldots ,2^L\}$, such that $\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )$.

Proof of Lemma 2. Let ${\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})$, and ${\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')=g_{{\varvec{yz}}}({\varvec{\theta }})$ for some ${\varvec{y}}\in \{0,1\}^K$ and ${\varvec{z}}\in \{0,1\}^L$, with $g_{{\varvec{yz}}}({\varvec{\theta }})$ defined as in (10). Correspondingly, let ${\varvec{\omega }}=({\varvec{\eta }}, {\varvec{\zeta }}, {\varvec{\Pi }})=\psi ({\varvec{\theta }})$ and ${\varvec{\omega }}'=({\varvec{\eta }}', {\varvec{\zeta }}', {\varvec{\Pi }}')=\psi ({\varvec{\theta }}')$.

For any $s\in \{1,\ldots ,2^K\}$ and $t\in \{1,\ldots ,2^L\}$, there exist ${\varvec{b}}\in \{0,1\}^K$ and ${\varvec{c}}\in \{0,1\}^L$ such that $s=\sum _{k=1}^Kb_k2^{k-1}+1$ and $t=\sum _{l=1}^Lc_l2^{l-1}+1$. Then we define $\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1$ and $\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1$, where ${\varvec{b}}'$ and ${\varvec{c}}'$ are constructed as

$$\begin{aligned} b_{k}'={\left\{ \begin{array}{ll}1-b_{k}, &{} \text {if} \quad y_k=1; \\ b_k, &{} \text {if}\quad y_k=0, \end{array}\right. } \quad \text{ and } \quad c_l'={\left\{ \begin{array}{ll}1-c_l, &{} \text {if} \quad z_l=1; \\ c_l, &{} \text {if}\quad z_l=0. \end{array}\right. } \end{aligned}$$

It can be verified that

$$\begin{aligned} \eta _{s}'&=\prod _{k=1}^K (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)}=\prod _{k=1}^K \alpha _k^{b_k'}(1-\alpha _k)^{(1-b_k')}\\&=\eta _{\sigma _1(s)}, \\ \zeta _{t}'&=\prod _{l=1}^L(\beta _l')^{c_l}(1-\beta _l')^{(1-c_l)}=\prod _{l=1}^L \beta _l^{c_l'}(1-\beta _l)^{(1-c_l')}=\zeta _{\sigma _2(t)}, \\ \pi _{st}'&=({\varvec{b}}^T,1)\widetilde{{\varvec{W}}}'({\varvec{c}}^T,1)^T =({\varvec{b}}^T,1){\varvec{E}}_{{\varvec{y}}}^T\widetilde{{\varvec{W}}}{\varvec{E}}_{{\varvec{z}}}({\varvec{c}}^T,1)^T\\&=\big (({\varvec{b}}')^T,1\big )\widetilde{{\varvec{W}}}\big (({\varvec{c}}')^T,1\big )^T=\pi _{\sigma _1(s)\sigma _2(t)}, \end{aligned}$$

where the third equality follows from the equalities that

$$\begin{aligned} {\varvec{E}}_{{\varvec{y}}}\big ({\varvec{b}}^T,1\big )^T&=\Big (\big ({\varvec{b}}-2\text {diag}({\varvec{y}}){\varvec{b}}+{\varvec{y}}\big )^T,1\Big )^T=\big (({\varvec{b}}')^T, 1\big )^T,\\ {\varvec{E}}_{{\varvec{z}}}\big ({\varvec{c}}^T,1\big )^T&=\Big (\big ({\varvec{c}}-2\text {diag}({\varvec{z}}){\varvec{c}}+{\varvec{z}}\big )^T,1\Big )^T=\big (({\varvec{c}}')^T, 1\big )^T. \end{aligned}$$

Thus, $\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )$, and the proof of Lemma 2 is completed.
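
The label-flipping permutation can be checked numerically. The sketch below assumes NumPy and hypothetical parameter values. It takes ${\varvec{E}}_{{\varvec{y}}}$ to be the block matrix with rows $({\varvec{I}}-2\text {diag}({\varvec{y}}),\ {\varvec{y}})$ and $({\varvec{0}}^T,\ 1)$, which is consistent with the two identities displayed above, and it takes the action of $g_{{\varvec{yz}}}$ on ${\varvec{\alpha }}$ to be $\alpha _k\mapsto 1-\alpha _k$ whenever $y_k=1$, as implied by the equality for $\eta _s'$. Definition (10) is not reproduced in this appendix, so both choices are assumptions of the example.

```python
import numpy as np

def E(y):
    """Block matrix [[I - 2 diag(y), y], [0^T, 1]] acting on (b^T, 1)^T."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    M = np.eye(n + 1)
    M[:n, :n] -= 2.0 * np.diag(y)
    M[:n, n] = y
    return M

def bits(n):
    """Row s-1 holds the binary vector b with s = sum_k b_k 2^(k-1) + 1."""
    return np.array([[(s >> k) & 1 for k in range(n)] for s in range(2 ** n)])

def eta_vec(alpha):
    return np.prod(np.where(bits(len(alpha)) == 1, alpha, 1.0 - alpha), axis=1)

def pi_mat(W_tilde):
    K, L = W_tilde.shape[0] - 1, W_tilde.shape[1] - 1
    Bk = np.hstack([bits(K), np.ones((2 ** K, 1))])
    Cl = np.hstack([bits(L), np.ones((2 ** L, 1))])
    return Bk @ W_tilde @ Cl.T          # entry (s, t) is (b^T, 1) W_tilde (c^T, 1)^T

alpha = np.array([0.2, 0.7])                                 # hypothetical values
W_tilde = np.arange(12, dtype=float).reshape(3, 4) / 10.0    # arbitrary (K+1) x (L+1) matrix
y, z = np.array([1, 0]), np.array([0, 1, 1])                 # which labels are flipped

alpha_flipped = np.where(y == 1, 1.0 - alpha, alpha)
W_flipped = E(y).T @ W_tilde @ E(z)

# sigma_1 and sigma_2 flip the bits selected by y and z (0-based indices).
y_int = sum(int(v) << k for k, v in enumerate(y))
z_int = sum(int(v) << k for k, v in enumerate(z))
sigma1 = [s ^ y_int for s in range(2 ** len(y))]
sigma2 = [t ^ z_int for t in range(2 ** len(z))]

eta_ok = np.allclose(eta_vec(alpha_flipped), eta_vec(alpha)[sigma1])
pi_ok = np.allclose(pi_mat(W_flipped), pi_mat(W_tilde)[np.ix_(sigma1, sigma2)])
print(eta_ok, pi_ok)
```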

Lemma 3

For any ${\varvec{\theta }}, {\varvec{\theta }}' \in {\varvec{\Theta }}^{\text {o}}$, if there exists $\nu =(\nu _1,\nu _2)$ with $\nu _1$ and $\nu _2$ being permutations on $\{1,\ldots ,K\}$ and $\{1,\ldots ,L\}$ and ${\varvec{y}}\in \{0,1\}^K$ and ${\varvec{z}}\in \{0,1\}^L$, such that ${\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )$, then we have ${\varvec{\theta }}'\sim {\varvec{\theta }}$ and $P_o({\varvec{A}};{\varvec{\theta }}')=P_o({\varvec{A}};{\varvec{\theta }})$, where “$\sim $” is an equivalence relation, that is, it satisfies the reflexive, symmetric and transitive properties.

Proof of Lemma 3. For any ${\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}$, setting $\nu =(\nu _1,\nu _2)$ with $\nu _1(k)=k$ and $\nu _2(l)=l$ and ${\varvec{y}}={\varvec{z}}={\varvec{0}}$ leads to ${\varvec{\theta }}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )$, which gives the reflexive property. As for the symmetric property, it can be verified that $g_{\nu ({\varvec{yz}})}\big (h_{\nu }({\varvec{\theta }})\big )=h_{\nu }\big (g_{{\varvec{yz}}}({\varvec{\theta }})\big )$, where $\nu ({\varvec{yz}})=\nu _1({\varvec{y}})\nu _2({\varvec{z}})$ with $\nu _1({\varvec{y}})_k=y_{\nu _1(k)}$ and $\nu _2({\varvec{z}})_l=z_{\nu _2(l)}$. Therefore, if ${\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )$, then ${\varvec{\theta }}=h_{\nu ^{-1}}\big (g_{{\varvec{yz}}}({\varvec{\theta }}')\big )=g_{\nu ^{-1}({\varvec{yz}})}\big (h_{\nu ^{-1}}({\varvec{\theta }}')\big )$.

Finally, let ${\varvec{\theta }}_1=g_{{\varvec{y}}{\varvec{z}}}\big (h_{\nu }({\varvec{\theta }}_0)\big )$ and ${\varvec{\theta }}_2=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}_1)\big )$, then ${\varvec{\theta }}_1=h_{\nu }\big (g_{\nu ^{-1}({\varvec{y}}{\varvec{z}})}({\varvec{\theta }}_0)\big )$ and

$$\begin{aligned} {\varvec{\theta }}_2&=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}\big (h_{\nu }\big (g_{\nu ^{-1}({\varvec{y}}{\varvec{z}})}({\varvec{\theta }}_0)\big )\big )\big )=g_{{\varvec{y}}'{\varvec{z}}'}\big (g_{\nu '({\varvec{yz}})}\big (h_{\nu '\nu }({\varvec{\theta }}_0)\big )\big )\\&=g_{{\varvec{y}}''{\varvec{z}}''}\big (h_{\nu ''}({\varvec{\theta }}_0)\big ), \end{aligned}$$

where ${\varvec{y}}''=(y_k'')_{k=1}^K$ with $y_k''=|y_k'-y_{\nu '_1(k)}|$, ${\varvec{z}}''=(z_l'')_{l=1}^L$ with $z_l''=|z_l'-z_{\nu '_2(l)}|$, $\nu ''=(\nu _1'',\nu _2'')$ with $\nu _1''(k)=\nu _1'\big (\nu _1(k)\big )$ and $\nu _2''(l)=\nu _2'\big (\nu _2(l)\big )$, and the last equality follows from

$$\begin{aligned} {\varvec{E}}_{\nu _1'({\varvec{y}})}{\varvec{E}}_{{\varvec{y}}'}&=\begin{pmatrix} \Big ({\varvec{I}}-2\text {diag}\big (\nu _1'({\varvec{y}})\big )\Big )\big ({\varvec{I}}-2\text {diag}({\varvec{y}}')\big ) &{} \Big ({\varvec{I}}-2\text {diag}\big (\nu _1'({\varvec{y}})\big )\Big ){\varvec{y}}'+\nu _1'({\varvec{y}}) \\ {\varvec{0}}^T &{} 1\end{pmatrix}\\&=\begin{pmatrix} {\varvec{I}}-2\text {diag}({\varvec{y}}'') &{} {\varvec{y}}'' \\ {\varvec{0}}^T &{} 1 \end{pmatrix}={\varvec{E}}_{{\varvec{y}}''}, \end{aligned}$$

and similarly ${\varvec{E}}_{\nu _2'({\varvec{z}})}{\varvec{E}}_{{\varvec{z}}'}={\varvec{E}}_{{\varvec{z}}''}$. It then follows that ${\varvec{\theta }}' \sim {\varvec{\theta }}$.
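
Because every matrix in the product is diagonal or a vector, the block identity reduces to two scalar identities for binary entries $a=\nu _1'({\varvec{y}})_k$ and $b=y_k'$, namely $(1-2a)(1-2b)=1-2|b-a|$ and $(1-2a)b+a=|b-a|$. The short Python check below enumerates the four cases; it is only an illustration of this entrywise step, and the variable names are ours.

```python
from itertools import product

checks = []
for a, b in product((0, 1), repeat=2):   # a = nu_1'(y)_k, b = y'_k
    d = abs(b - a)                       # y''_k
    checks.append((1 - 2 * a) * (1 - 2 * b) == 1 - 2 * d)   # top-left diagonal entry
    checks.append((1 - 2 * a) * b + a == d)                 # top-right entry
all_hold = all(checks)
print(all_hold)
```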

Furthermore, if ${\varvec{\theta }}'\sim {\varvec{\theta }}$, there exists $\sigma =(\sigma _1,\sigma _2)$ with permutations $\sigma _1$ and $\sigma _2$ on $\{1,\ldots ,2^K\}$ and $\{1,\ldots ,2^L\}$, such that $\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )$, where $\psi $ is defined as in (9). It then follows from Lemma 1 and Proposition 1 that $P_o({\varvec{A}};{\varvec{\theta }}')=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }}')\big )=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }})\big )=P_o({\varvec{A}};{\varvec{\theta }})$. This completes the proof of Lemma 3.

Proof of Theorem 2. For any ${\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\setminus {\varvec{\Theta }}_0^{\text {o}}$, it is clear that there exist $\nu $, ${\varvec{y}}$ and ${\varvec{z}}$ such that $\tilde{{\varvec{\theta }}}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )$ with ${\tilde{\alpha }}_1<\cdots<{\tilde{\alpha }}_K<\frac{1}{2}$ and ${\tilde{\beta }}_1<\cdots<{\tilde{\beta }}_L<\frac{1}{2}$. Similarly, we have $\tilde{{\varvec{\theta }}}'=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}')\big )$ with ${\tilde{\alpha }}_1'<\cdots<{\tilde{\alpha }}_K'<\frac{1}{2}$ and ${\tilde{\beta }}_1'<\cdots<{\tilde{\beta }}_L'<\frac{1}{2}$ for some $\nu '$, ${\varvec{y}}'$ and ${\varvec{z}}'$. It then follows from Lemma 3 that $\tilde{{\varvec{\theta }}} \sim {\varvec{\theta }}$ and $\tilde{{\varvec{\theta }}}' \sim {\varvec{\theta }}'$, and $P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};{\varvec{\theta }})$ and $P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')=P_o({\varvec{A}};{\varvec{\theta }}')$. Thus, it suffices to prove that $\tilde{{\varvec{\theta }}}=\tilde{{\varvec{\theta }}}'$ given $P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')$.

It follows from Lemma 1 that $P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')$ if and only if $P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )$. Furthermore, as $\tilde{{\varvec{\theta }}}$ and $\tilde{{\varvec{\theta }}}'\notin {\varvec{\Theta }}^{\text {o}}_0$, it can be verified that $\psi (\tilde{{\varvec{\theta }}}),\psi (\tilde{{\varvec{\theta }}}')\notin {\varvec{\Omega }}_0$. With the assumption that $n\ge \max \{2^{K+1},2^{L+1}\}$, it follows from Proposition 1 that $P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )$ if and only if there exists $\sigma =(\sigma _1,\sigma _2)$ with $\sigma _1$ and $\sigma _2$ being permutations on $\{1,\ldots ,2^K\}$ and $\{1,\ldots ,2^L\}$, such that $\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )$. Then, we have

$$\begin{aligned}&\Big \{\prod _k{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}; {\varvec{b}}\in \{0,1\}^K\Big \}\\&\quad =\Big \{\prod _k({\tilde{\alpha }}_k')^{b_k}(1-{\tilde{\alpha }}_k')^{1-b_k}; {\varvec{b}}\in \{0,1\}^K\Big \}, \end{aligned}\tag{18}$$

and

$$\begin{aligned}&\Big \{\prod _l{\tilde{\beta }}_l^{c_l}(1-{\tilde{\beta }}_l)^{1-c_l}; {\varvec{c}}\in \{0,1\}^L\Big \}\\&\quad =\Big \{\prod _l({\tilde{\beta }}_l')^{c_l}(1-{\tilde{\beta }}_l')^{1-c_l}; {\varvec{c}}\in \{0,1\}^L\Big \}. \end{aligned}\tag{19}$$

Thus, $\prod _k{\tilde{\alpha }}_k=\prod _k{\tilde{\alpha }}_k'$ as they are the smallest elements of each set in (18), and further

$$\begin{aligned} (1-{\tilde{\alpha }}_K)\prod _{k=1}^{K-1}{\tilde{\alpha }}_k=(1-{\tilde{\alpha }}_K')\prod _{k=1}^{K-1}{\tilde{\alpha }}_k', \end{aligned}$$

as they are the second smallest elements of each set in (18). Then, we have $\frac{{\tilde{\alpha }}_K}{1-{\tilde{\alpha }}_K}=\frac{{\tilde{\alpha }}_K'}{1-{\tilde{\alpha }}_K'}$, and thus ${\tilde{\alpha }}_K={\tilde{\alpha }}_K'$. Removing ${\tilde{\alpha }}_K^{b_K}(1-{\tilde{\alpha }}_K)^{1-b_K}$ from all items in (18), it follows from a similar argument that ${\tilde{\alpha }}_{K-1}={\tilde{\alpha }}_{K-1}'$, and repeating this treatment K times finally leads to $\tilde{{\varvec{\alpha }}}=\tilde{{\varvec{\alpha }}}'$. Similarly, we also have $\tilde{{\varvec{\beta }}}=\tilde{{\varvec{\beta }}}'$.
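
The peeling argument can be sketched as a small procedure that recovers $\tilde{{\varvec{\alpha }}}$ from the unordered set in (18). The Python code below uses only the standard library and assumes hypothetical values with ${\tilde{\alpha }}_1<\cdots <{\tilde{\alpha }}_K<\frac{1}{2}$. The function name `peel` and the tolerance-based matching of each item with its partner are choices made for this illustration, not part of the original proof.

```python
from itertools import product
from math import prod

def peel(values, tol=1e-9):
    """Recover alpha_1 < ... < alpha_K < 1/2 from the unordered set of 2^K products."""
    vals = sorted(values)
    alphas = []
    while len(vals) > 1:
        m1, m2 = vals[0], vals[1]          # smallest and second smallest products
        a = m1 / (m1 + m2)                 # m1 / m2 = a / (1 - a) for the largest alpha left
        alphas.append(a)
        remaining, reduced = list(vals), []
        while remaining:
            x = remaining.pop(0)           # smallest remaining item has the form a * p
            p = x / a
            j = next(i for i, v in enumerate(remaining) if abs(v - (1 - a) * p) <= tol)
            remaining.pop(j)               # drop its partner (1 - a) * p
            reduced.append(p)
        vals = sorted(reduced)             # the products without the factor just found
    return alphas[::-1]

alpha = [0.1, 0.25, 0.4]                   # hypothetical, increasing and below 1/2
products = [prod(a if b else 1 - a for a, b in zip(alpha, bs))
            for bs in product((0, 1), repeat=len(alpha))]
alpha_hat = peel(reversed(products))       # the input order does not matter
print(alpha_hat)
```

Each pass identifies the largest remaining ${\tilde{\alpha }}_k$ from the two smallest products and then divides it out, mirroring the K repetitions in the proof.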

Next, for any $s\in \{1,\ldots ,2^K\}$ and $t\in \{1,\ldots ,2^L\}$, let $s=\sum _{k=1}^Kb_k2^{k-1}+1$ and $t=\sum _{l=1}^Lc_l2^{l-1}+1$ for some ${\varvec{b}}\in \{0,1\}^K$ and ${\varvec{c}}\in \{0,1\}^L$, and $\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1$ and $\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1$ for some ${\varvec{b}}'\in \{0,1\}^K$ and ${\varvec{c}}'\in \{0,1\}^L$. It then follows from $\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )$ that

$$\begin{aligned} \eta _s'=\prod _k{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}=\prod _k{\tilde{\alpha }}_k^{b_k'}(1-{\tilde{\alpha }}_k)^{1-b_k'}=\eta _{\sigma _1(s)}, \end{aligned}$$

which leads to

$$\begin{aligned} \sum _kb_k\log \Big (\frac{{\tilde{\alpha }}_k}{1-{\tilde{\alpha }}_k}\Big )=\sum _kb_k'\log \Big (\frac{{\tilde{\alpha }}_k}{1-{\tilde{\alpha }}_k}\Big ). \end{aligned}$$

It then follows from the definition of ${\varvec{\Theta }}_0^{\text {o}}$ immediately that ${\varvec{b}}'={\varvec{b}}$, and thus $\sigma _1(s)=s$. Similarly, $\sigma _2(t)=t$, and thus $\psi (\tilde{{\varvec{\theta }}}')=\psi (\tilde{{\varvec{\theta }}})$. The desired result then follows from Lemma 1 immediately.

About this article

Cite this article

Zhang, J., Wang, J. Identifiability and parameter estimation of the overlapped stochastic co-block model. Stat Comput 32, 57 (2022). https://doi.org/10.1007/s11222-022-10114-1

Received: 30 August 2021
Accepted: 23 May 2022
Published: 28 June 2022
DOI: https://doi.org/10.1007/s11222-022-10114-1
