Abstract
The stochastic block model (SBM) has been extensively studied for undirected network data with community structure, yet its extension to directed networks, the stochastic co-block model (ScBM), was proposed only recently. The key feature of the ScBM is that it introduces out- and in-communities to capture the different sending and receiving patterns among nodes. In this paper, we further extend the ScBM so that each node may belong to multiple out- or in-communities. In particular, we formulate the ScBM as a generative model, in which the unknown community assignments are modeled as either exclusive or overlapped. We also establish the identifiability of the generative ScBM and estimate its parameters via an efficient variational EM algorithm. The advantage of the generative ScBM is demonstrated on a variety of simulated networks and a real political blog network.
Acknowledgements
This research is supported in part by HK RGC grants GRF-11303918, GRF-11300919 and GRF-11304520. The authors are grateful to the co-ordinating editor and two anonymous referees for their insightful comments and constructive suggestions, which have improved the manuscript significantly.
Appendix
Proof of Proposition 1. Given \(P_s({\varvec{A}};{\varvec{\theta }})\) with \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},{\varvec{W}})\in {\varvec{\Theta }}^{\text {s}}{\setminus }{\varvec{\Theta }}_0^{\text {s}}\), we first show that \({\varvec{\alpha }}\) can be uniquely determined up to a permutation. For any in-node \(j\ne 1\), note that
$$P(a_{1j}=1|T_{1k}=1)=\sum _{l=1}^L\beta _lw_{1kl}={\varvec{W}}_{1k\cdot }{\varvec{\beta }},$$
which does not depend on j, where \({\varvec{W}}_{1k\cdot }\) is the k-th row of \({\varvec{W}}_1\). Thus, we define \({\varvec{\rho }}=(\rho _1,\ldots ,\rho _K)^T\) with \(\rho _k=P(a_{1j}=1|T_{1k}=1)\) for \(1\le k\le K\). Furthermore, for any m different in-nodes \(j_1,\ldots ,j_m \in \{2,\ldots ,n\}\), we have
$$P(a_{1j_1}=1,\ldots ,a_{1j_m}=1|T_{1k}=1)=\rho _k^m.$$
As \(n\ge 2K\), define \(q_1=1\) and \(q_{m+1}=P(a_{12}=1,\ldots ,a_{1,m+1}=1)\) for \(1\le m\le 2K-1\); it then holds that
$$q_{m+1}=\sum _{k=1}^K\alpha _k\rho _k^m,\quad 0\le m\le 2K-1.$$
Note that \(\{q_m\}_{m=1}^{2K}\) can be fully determined by \(P_s({\varvec{A}};{\varvec{\theta }})\), whereas \(\alpha _k\) and \(\rho _k\) are not due to their dependence on the unknown \({\varvec{T}}\).
Let \({\varvec{B}}=(b_{ij})_{i,j}\) be a \((K+1)\times K\) matrix with \(b_{ij}=q_{i+j-1}\), and let \({\varvec{B}}_{-i}\) denote the square matrix obtained by deleting the i-th row of \({\varvec{B}}\). Then we have \({\varvec{B}}_{-(K+1)}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\tilde{{\varvec{\rho }}}}^T\) as \(b_{ij}=\sum _{k=1}^K\rho _k^{i-1}\alpha _k\rho _k^{j-1}\), where \(\tilde{{\varvec{\rho }}}=({\tilde{\rho }}_{ik})_{i,k=1}^K\) with \({\tilde{\rho }}_{ik}=\rho _k^{i-1}\) is an invertible Vandermonde matrix, and \({\varvec{A}}_0=\text {diag}({\varvec{\alpha }})\). Let \(D_i=\text {det}({\varvec{B}}_{-i})\) and \(f(x)=\sum _{i=1}^{K+1}(-1)^{i+K+1}D_ix^{i-1}\). Note that the j-th column of \({\varvec{B}}\) can be rewritten as \({\varvec{B}}_{\cdot j}=\sum _{k=1}^K \alpha _k \rho _k^{j-1} {\varvec{x}}_k\) with \({\varvec{x}}_k=(1,\rho _k,\ldots ,\rho _k^K)^T\), and \(f(\rho _k)\) is the determinant of the square matrix \(({\varvec{B}}, {\varvec{x}}_k)\). Since all the columns of \(({\varvec{B}}, {\varvec{x}}_k)\) are linear combinations of \(\{{\varvec{x}}_k\}_{k=1}^K\), we have \(f(\rho _k)=0\) for any k. As \(D_{K+1}\ne 0\), f is a polynomial of degree K with leading coefficient \(D_{K+1}\) and K distinct roots, and it follows immediately that
$$f(x)=D_{K+1}\prod _{k=1}^K(x-\rho _k). \tag{16}$$
Since \(\{q_m\}_{m=1}^{2K}\) is fully determined given \(P_s({\varvec{A}};{\varvec{\theta }})\), so are \({\varvec{B}}\) and f(x). Therefore, it follows from (16) that \({\varvec{\rho }}\) can be fully determined up to a permutation of \(\{1,\ldots ,K\}\). Solving for \({\varvec{A}}_0\) then gives \({\varvec{A}}_0=\tilde{{\varvec{\rho }}}^{-1}{\varvec{B}}_{-(K+1)}(\tilde{{\varvec{\rho }}}^T)^{-1}\), whose diagonal elements \(\alpha _k\) are likewise determined up to a permutation of \(\{1,\ldots ,K\}\).
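The argument above is constructive, and a small numerical sketch may help fix ideas. The values of `alpha` and `rho` below are hypothetical and chosen only for illustration; the code forms the moments \(q_m\), builds \({\varvec{B}}\), finds the roots of f(x) to recover \({\varvec{\rho }}\), and then solves the Vandermonde system for \({\varvec{\alpha }}\). It is a sketch under these assumptions, not the estimation procedure of the paper.

```python
import numpy as np

# Hypothetical ground truth (illustrative values, not taken from the paper).
alpha = np.array([0.5, 0.3, 0.2])   # out-community proportions, summing to 1
rho = np.array([0.2, 0.5, 0.8])     # rho_k = P(a_1j = 1 | T_1k = 1), all distinct
K = len(alpha)

# Observable moments q_m = sum_k alpha_k * rho_k^(m-1) for m = 1, ..., 2K.
q = np.array([np.sum(alpha * rho ** (m - 1)) for m in range(1, 2 * K + 1)])

# (K+1) x K matrix B with b_ij = q_{i+j-1}; with 0-based i, j this is q[i + j].
B = np.array([[q[i + j] for j in range(K)] for i in range(K + 1)])

# D_i = det(B with row i removed); f(x) = sum_i (-1)^(i+K+1) D_i x^(i-1).
D = [np.linalg.det(np.delete(B, i, axis=0)) for i in range(K + 1)]
coeffs = [(-1) ** ((i + 1) + K + 1) * D[i] for i in range(K + 1)]  # ascending powers

# np.roots expects descending powers; the roots of f are the rho_k.
rho_hat = np.sort(np.roots(coeffs[::-1]).real)

# Vandermonde matrix V[i, k] = rho_k^i, so that B_{-(K+1)} = V diag(alpha) V^T.
V = np.vander(rho_hat, K, increasing=True).T
A0 = np.linalg.solve(V, np.linalg.solve(V, B[:K]).T)
alpha_hat = np.diag(A0)
```

Because the roots are returned in sorted order, `alpha_hat` is aligned with the sorted `rho_hat`; this is exactly the label permutation that the proposition leaves undetermined.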
For \({\varvec{\beta }}\), note that for any out-node \(i\ne 2\),
$$P(a_{i2}=1|S_{2l}=1)=\sum _{k=1}^K\alpha _kw_{1kl}={\varvec{W}}_{1l\cdot }^T{\varvec{\alpha }},$$
which does not depend on i, where \({\varvec{W}}_{1l\cdot }^T\) is the l-th row of \({\varvec{W}}_1^T\). Thus, we can define \({\varvec{\rho }}'=(\rho _1',\ldots ,\rho _L')^T\) with \(\rho _l'=P(a_{i2}=1|S_{2l}=1)\) for \(1\le l\le L\). Let \(\tilde{{\varvec{\rho }}}'=({\tilde{\rho }}'_{jl})_{j,l=1}^L\) with \({\tilde{\rho }}'_{jl}=(\rho _l')^{j-1}\) and \({\varvec{B}}_0=\text {diag}({\varvec{\beta }})\). Then as \(n\ge 2L\), \({\varvec{\rho }}'\) and \({\varvec{\beta }}\) can also be determined up to a permutation of \(\{1,\ldots ,L\}\) following a similar treatment as for \({\varvec{\alpha }}\).
For \({\varvec{W}}\), let \({\varvec{H}}=(h_{ij})_{i,j}\) with \(h_{ij}=P(a_{12}=\cdots =a_{1,i+1}=1, a_{32}=\cdots =a_{j+1,2}=1)\) for \(1\le i\le K\) and \(1\le j\le L\). Note that \(h_{i1}=P(a_{12}=\cdots =a_{1,i+1}=1)\). Then we have
$$h_{ij}=\sum _{k=1}^K\sum _{l=1}^L\rho _k^{i-1}\alpha _kw_{1kl}\beta _l(\rho _l')^{j-1},$$
and thus \({\varvec{H}}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\varvec{W}}_1{\varvec{B}}_0(\tilde{{\varvec{\rho }}}')^T\). As \({\varvec{H}}\) can be fully determined by \(P_s({\varvec{A}};{\varvec{\theta }})\), it immediately follows that \({\varvec{W}}_1={\varvec{A}}_0^{-1}\tilde{{\varvec{\rho }}}^{-1}{\varvec{H}}\big ((\tilde{{\varvec{\rho }}}')^T\big )^{-1}{\varvec{B}}_0^{-1}\) can be fully determined up to permutations for its rows and columns. As \(w_{kl}=\text {logit}(w_{1kl})\), this completes the proof of Proposition 1.
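The last step can be checked numerically. The sketch below uses hypothetical values of `alpha`, `beta` and `W1` (not from the paper), computes \({\varvec{H}}\) entry by entry as a double sum over community pairs, and then inverts the factorization \({\varvec{H}}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\varvec{W}}_1{\varvec{B}}_0(\tilde{{\varvec{\rho }}}')^T\) to recover \({\varvec{W}}_1\).

```python
import numpy as np

# Hypothetical parameters (illustrative only).
alpha = np.array([0.4, 0.6])                 # out-community proportions
beta = np.array([0.2, 0.3, 0.5])             # in-community proportions
W1 = np.array([[0.1, 0.6, 0.3],
               [0.7, 0.2, 0.5]])             # w_{1kl}: edge probabilities
K, L = W1.shape

rho = W1 @ beta        # rho_k  = P(a_1j = 1 | T_1k = 1)
rho_p = W1.T @ alpha   # rho'_l = P(a_i2 = 1 | S_2l = 1)

# h_ij as a double sum over (k, l); 0-based exponents i, j stand for i-1, j-1.
H = np.array([[sum(alpha[k] * rho[k] ** i * W1[k, l] * beta[l] * rho_p[l] ** j
                   for k in range(K) for l in range(L))
               for j in range(L)]
              for i in range(K)])

# Vandermonde factors: V[i, k] = rho_k^i and Vp[j, l] = (rho'_l)^j.
V = np.vander(rho, K, increasing=True).T
Vp = np.vander(rho_p, L, increasing=True).T

# W1 = A0^{-1} V^{-1} H (Vp^T)^{-1} B0^{-1}.
W1_hat = np.diag(1 / alpha) @ np.linalg.solve(V, H) @ np.linalg.inv(Vp.T) @ np.diag(1 / beta)
```

The inversion requires the entries of `rho` to be distinct, and likewise for `rho_p`, which is the role played by excluding the degenerate parameter set in the proposition.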
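Proof of Theorem 1. Suppose there exist \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})\ne {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^{\text {g}}{\setminus }{\varvec{\Theta }}_0^{\text {g}} \) such that \(P_g({\varvec{A}};{\varvec{\theta }})=P_g({\varvec{A}};{\varvec{\theta }}')\); it then suffices to show that \({\varvec{\theta }}'\) and \({\varvec{\theta }}\) are identical up to a permutation over community labels. Note that in (2),
$$\text {logit}\,P(a_{ij}=1|{\varvec{T}}_i,{\varvec{S}}_j)={\varvec{T}}_i^T{\varvec{W}}{\varvec{S}}_j+{\varvec{T}}_i^T{\varvec{u}}+{\varvec{v}}^T{\varvec{S}}_j+w_0={\varvec{T}}_i^T\big ({\varvec{W}}+{\varvec{u}}{\varvec{1}}_L^T+{\varvec{1}}_K{\varvec{v}}^T+w_0{\varvec{1}}_K{\varvec{1}}_L^T\big ){\varvec{S}}_j,$$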
where the last equality follows from the exclusive constraints that \({\varvec{T}}_i^T{\varvec{1}}_K={\varvec{S}}_j^T{\varvec{1}}_L=1\). Thus, Proposition 1 immediately implies that there exist permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,K\}\) and \(\{1,\ldots ,L\}\), such that \(\alpha _k'=\alpha _{\sigma _1(k)}\), \(\beta _l'=\beta _{\sigma _2(l)}\) and \(w_{2kl}'=w_{2\sigma _1(k)\sigma _2(l)}\) for each \(1\le k\le K\) and \(1\le l\le L\), where \(w_{2kl}=\text {logistic}(w_{kl}+u_k+v_l+w_0)\).
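It then remains to show \({\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}\) for any k and l. In fact, it follows from the definition of \({\varvec{W}}_2\) that
$$w_{kl}'+u_k'+v_l'+w_0'=w_{\sigma _1(k)\sigma _2(l)}+u_{\sigma _1(k)}+v_{\sigma _2(l)}+w_0. \tag{17}$$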
Summing both sides of (17) over k and l implies that \(w_0'=w_0\). With \(w_0'\) and \(w_0\) canceled, summing over l or over k, respectively, implies that \(u_k'=u_{\sigma _1(k)}\) or \(v_l'=v_{\sigma _2(l)}\), which also leads to \(w_{kl}'=w_{\sigma _1(k)\sigma _2(l)}\). Thus, \({\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}\) by setting \(\sigma _1(K+1)=K+1\) and \(\sigma _2(L+1)=L+1\). The desired result then follows immediately.
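The summation argument can be illustrated as a two-way decomposition. The sketch below assumes sum-to-zero constraints on \({\varvec{u}}\), \({\varvec{v}}\) and on the rows and columns of \({\varvec{W}}\); these constraints are an assumption made here for illustration, since the definition of \({\varvec{\Theta }}^{\text {g}}\) is not restated in this appendix. The helper `decompose` and the permutations `s1`, `s2` are hypothetical names.

```python
import numpy as np

def decompose(M):
    """Split m_kl = w_kl + u_k + v_l + w_0 under assumed sum-to-zero constraints."""
    w0 = M.mean()                      # grand mean
    u = M.mean(axis=1) - w0            # row effects
    v = M.mean(axis=0) - w0            # column effects
    W = M - u[:, None] - v[None, :] - w0
    return w0, u, v, W

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 4))            # plays the role of logit(w_{2kl})

# Hypothetical label permutations sigma_1 and sigma_2 (0-based).
s1 = np.array([2, 0, 1])
s2 = np.array([1, 3, 0, 2])

w0, u, v, W = decompose(M)
w0_p, u_p, v_p, W_p = decompose(M[np.ix_(s1, s2)])
```

Under these assumed constraints, permuting the rows and columns of `M` leaves `w0` unchanged and permutes `u`, `v` and `W` in the same way, which mirrors the conclusion drawn from (17).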
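Proof of Lemma 1. We first show that \(\psi \) is an injective mapping. Suppose there exist \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}}), {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^\text {o}\), such that \(\psi ({\varvec{\theta }})=({\varvec{\eta }},{\varvec{\zeta }},{\varvec{\Pi }})=({\varvec{\eta }}',{\varvec{\zeta }}',{\varvec{\Pi }}')=\psi ({\varvec{\theta }}')\). By the definition of \(\psi \), we have
$$\prod _{k=1}^K\alpha _k^{b_k}(1-\alpha _k)^{1-b_k}=\prod _{k=1}^K(\alpha _k')^{b_k}(1-\alpha _k')^{1-b_k},\quad \prod _{l=1}^L\beta _l^{c_l}(1-\beta _l)^{1-c_l}=\prod _{l=1}^L(\beta _l')^{c_l}(1-\beta _l')^{1-c_l},$$
$${\varvec{b}}^T{\varvec{W}}{\varvec{c}}+{\varvec{b}}^T{\varvec{u}}+{\varvec{v}}^T{\varvec{c}}+w_0={\varvec{b}}^T{\varvec{W}}'{\varvec{c}}+{\varvec{b}}^T{\varvec{u}}'+({\varvec{v}}')^T{\varvec{c}}+w_0',$$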
for any \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\). In particular, setting \({\varvec{b}}={\varvec{1}}_K\) leads to \(\prod _{k=1}^K \alpha _k=\prod _{k=1}^K\alpha _k'\), and setting \({\varvec{b}}=({\varvec{1}}_{K-1}^T,0)^T\) leads to \((1-\alpha _K)\prod _{k=1}^{K-1} \alpha _k=(1-\alpha _K')\prod _{k=1}^{K-1}\alpha _k'\). Taking the ratio of these two equalities gives \(\frac{\alpha _K}{1-\alpha _K}=\frac{\alpha _K'}{1-\alpha _K'}\). Thus, \(\alpha _K=\alpha _K'\), and then \(\prod _{k=1}^{K-1} \alpha _k^{b_k}(1-\alpha _k)^{(1-b_k)}=\prod _{k=1}^{K-1} (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)}\). Repeating the same treatment for \(K-1,\ldots ,1\) yields that \({\varvec{\alpha }}={\varvec{\alpha }}'\), and similarly \({\varvec{\beta }}={\varvec{\beta }}'\). For \(\widetilde{{\varvec{W}}}\), setting \({\varvec{b}}={\varvec{c}}={\varvec{0}}\) leads to \(w_0=w_0'\), and setting \({\varvec{c}}={\varvec{0}}\) and \({\varvec{b}}\) with only \(b_k=1\) leads to \(u_k=u_k'\). Similarly, setting \({\varvec{b}}={\varvec{0}}\) and \({\varvec{c}}\) with only \(c_l=1\) leads to \(v_l=v_l'\), and setting \({\varvec{b}}\) and \({\varvec{c}}\) with only \(b_k=c_l=1\) leads to \(w_{kl}'=w_{kl}\). Therefore, \(\widetilde{{\varvec{W}}}=\widetilde{{\varvec{W}}}'\) and \({\varvec{\theta }}={\varvec{\theta }}'\), and thus \(\psi \) is an injective mapping.
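The injectivity of the \({\varvec{\alpha }}\) component can be sketched in a few lines. The function names below are hypothetical, and the inversion is a slight variant of the peeling argument: instead of removing one coordinate at a time, it takes, for each k, the ratio between the entry at \({\varvec{b}}={\varvec{1}}_K\) and the entry with only \(b_k\) set to 0, which equals \(\alpha _k/(1-\alpha _k)\). It assumes \(0<\alpha _k<1\).

```python
from itertools import product
from math import prod

def eta_from_alpha(alpha):
    """Map alpha to eta, keyed by the membership vector b in {0,1}^K."""
    return {b: prod(a if bk else 1 - a for a, bk in zip(alpha, b))
            for b in product((0, 1), repeat=len(alpha))}

def alpha_from_eta(eta, K):
    """Invert the map using ratios of entries that differ in one coordinate."""
    ones = (1,) * K
    alpha = []
    for k in range(K):
        flipped = ones[:k] + (0,) + ones[k + 1:]
        r = eta[ones] / eta[flipped]      # equals alpha_k / (1 - alpha_k)
        alpha.append(r / (1 + r))
    return alpha

alpha = [0.1, 0.35, 0.6]                   # illustrative values
eta = eta_from_alpha(alpha)
alpha_back = alpha_from_eta(eta, len(alpha))
```

Since `alpha_back` reproduces `alpha`, two parameter vectors with the same labeled `eta` must coincide, which is the content of the injectivity claim for this component.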
For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), we have \({\varvec{\omega }}=\psi ({\varvec{\theta }})\in {\varvec{\Omega }}\), and then
$$P_o({\varvec{A}};{\varvec{\theta }})=P_s({\varvec{A}};{\varvec{\omega }}).$$
This completes the proof of Lemma 1.
Lemma 2
For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), let \({\varvec{\theta }}'=g_{{\varvec{yz}}}({\varvec{\theta }})\), with \(g_{{\varvec{yz}}}\) defined as in Section 2.2. Then there exists \(\sigma =(\sigma _1,\sigma _2)\) with permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\).
Proof of Lemma 2. Let \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})\), and \({\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')=g_{{\varvec{yz}}}({\varvec{\theta }})\) for some \({\varvec{y}}\in \{0,1\}^K\) and \({\varvec{z}}\in \{0,1\}^L\), with \(g_{{\varvec{yz}}}({\varvec{\theta }})\) defined as in (10). Correspondingly, let \({\varvec{w}}=({\varvec{\eta }}, {\varvec{\zeta }}, {\varvec{\Pi }})=\psi ({\varvec{\theta }})\) and \({\varvec{w}}'=({\varvec{\eta }}', {\varvec{\zeta }}', {\varvec{\Pi }}')=\psi ({\varvec{\theta }}')\).
For any \(s\in \{1,\ldots ,2^K\}\) and \(t\in \{1,\ldots ,2^L\}\), there exist \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\) such that \(s=\sum _{k=1}^Kb_k2^{k-1}+1\) and \(t=\sum _{l=1}^Lc_l2^{l-1}+1\). Then we define \(\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1\) and \(\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1\), where \({\varvec{b}}'\) and \({\varvec{c}}'\) are constructed as
$$b_k'=|b_k-y_k|,\quad c_l'=|c_l-z_l|.$$
It can be verified that
where the third equality follows from the equalities that
Thus, \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\), and the proof of Lemma 2 is completed.
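The permutation in Lemma 2 can be illustrated on the \({\varvec{\eta }}\) component alone. The sketch below assumes that \(g_{{\varvec{yz}}}\) replaces \(\alpha _k\) by \(1-\alpha _k\) whenever \(y_k=1\); this is an assumption made for illustration, since the definition in (10) is not restated in this appendix. Under it, flipping labels amounts to a bitwise XOR on the index \(s-1\). Indices are 0-based in the code, and all names are hypothetical.

```python
def eta_vector(alpha):
    """eta indexed by s = sum_k b_k 2^k (0-based version of the index in the proof)."""
    K = len(alpha)
    out = []
    for s in range(2 ** K):
        p = 1.0
        for k in range(K):
            b_k = (s >> k) & 1
            p *= alpha[k] if b_k else 1 - alpha[k]
        out.append(p)
    return out

alpha = [0.1, 0.35, 0.6]        # illustrative values
y = [1, 0, 1]                   # communities whose membership indicator is flipped

# Assumed action of the flip on alpha: alpha_k -> 1 - alpha_k when y_k = 1.
alpha_flipped = [1 - a if yk else a for a, yk in zip(alpha, y)]

# Induced permutation on indices: b'_k = |b_k - y_k|, i.e. XOR with the mask of y.
y_mask = sum(yk << k for k, yk in enumerate(y))
sigma1 = [s ^ y_mask for s in range(2 ** len(alpha))]

eta = eta_vector(alpha)
eta_flipped = eta_vector(alpha_flipped)
```

Here `eta_flipped[s]` equals `eta[sigma1[s]]` for every index, and `sigma1` is its own inverse, so the direction in which the permutation is applied does not matter.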
Lemma 3
For any \({\varvec{\theta }}, {\varvec{\theta }}' \in {\varvec{\Theta }}^{\text {o}}\), if there exist \(\nu =(\nu _1,\nu _2)\), with \(\nu _1\) and \(\nu _2\) being permutations on \(\{1,\ldots ,K\}\) and \(\{1,\ldots ,L\}\), and \({\varvec{y}}\in \{0,1\}^K\) and \({\varvec{z}}\in \{0,1\}^L\), such that \({\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), then we have \({\varvec{\theta }}'\sim {\varvec{\theta }}\) and \(P_o({\varvec{A}};{\varvec{\theta }}')=P_o({\varvec{A}};{\varvec{\theta }})\), where “\(\sim \)” is an equivalence relation satisfying the reflexive, symmetric and transitive properties.
Proof of Lemma 3. For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), setting \(\nu =(\nu _1,\nu _2)\) with \(\nu _1(k)=k\) and \(\nu _2(l)=l\) and \({\varvec{y}}={\varvec{z}}={\varvec{0}}\) leads to \({\varvec{\theta }}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), and thus the reflexive property holds. As for the symmetric property, it can be verified that \(g_{\nu ({\varvec{yz}})}\big (h_{\nu }({\varvec{\theta }})\big )=h_{\nu }\big (g_{{\varvec{yz}}}({\varvec{\theta }})\big )\), where \(\nu ({\varvec{yz}})=\nu _1({\varvec{y}})\nu _2({\varvec{z}})\) with \(\nu _1({\varvec{y}})_k=y_{\nu _1(k)}\) and \(\nu _2({\varvec{z}})_l=z_{\nu _2(l)}\). Therefore, if \({\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), then \({\varvec{\theta }}=h_{\nu ^{-1}}\big (g_{{\varvec{yz}}}({\varvec{\theta }}')\big )=g_{\nu ^{-1}({\varvec{yz}})}\big (h_{\nu ^{-1}}({\varvec{\theta }}')\big )\).
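Finally, let \({\varvec{\theta }}_1=g_{{\varvec{y}}{\varvec{z}}}\big (h_{\nu }({\varvec{\theta }}_0)\big )\) and \({\varvec{\theta }}_2=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}_1)\big )\), then \({\varvec{\theta }}_1=h_{\nu }\big (g_{\nu ^{-1}({\varvec{y}}{\varvec{z}})}({\varvec{\theta }}_0)\big )\) and
$${\varvec{\theta }}_2=g_{{\varvec{y}}'{\varvec{z}}'}\Big (h_{\nu '}\big (g_{{\varvec{y}}{\varvec{z}}}(h_{\nu }({\varvec{\theta }}_0))\big )\Big )=g_{{\varvec{y}}'{\varvec{z}}'}\Big (g_{\nu '({\varvec{y}}{\varvec{z}})}\big (h_{\nu ''}({\varvec{\theta }}_0)\big )\Big )=g_{{\varvec{y}}''{\varvec{z}}''}\big (h_{\nu ''}({\varvec{\theta }}_0)\big ),$$
where \({\varvec{y}}''=(y_k'')_{k=1}^K\) with \(y_k''=|y_k'-y_{\nu '_1(k)}|\), \({\varvec{z}}''=(z_l'')_{l=1}^L\) with \(z_l''=|z_l'-z_{\nu '_2(l)}|\), \(\nu ''=(\nu _1'',\nu _2'')\) with \(\nu _1''(k)=\nu _1'\big (\nu _1(k)\big )\) and \(\nu _2''(l)=\nu _2'\big (\nu _2(l)\big )\), and the last equality follows from
$${\varvec{E}}_{\nu _1'({\varvec{y}})}{\varvec{E}}_{{\varvec{y}}'}={\varvec{E}}_{{\varvec{y}}''},$$
and similarly \({\varvec{E}}_{\nu _2'({\varvec{z}})}{\varvec{E}}_{{\varvec{z}}'}={\varvec{E}}_{{\varvec{z}}''}\). It then follows that \({\varvec{\theta }}' \sim {\varvec{\theta }}\).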
Furthermore, if \({\varvec{\theta }}'\sim {\varvec{\theta }}\), there exists \(\sigma =(\sigma _1,\sigma _2)\) with permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\), where \(\psi \) is defined as in (9). It then follows from Lemma 1 and Proposition 1 that \(P_o({\varvec{A}};{\varvec{\theta }}')=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }}')\big )=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }})\big )=P_o({\varvec{A}};{\varvec{\theta }})\). This completes the proof of Lemma 3.
Proof of Theorem 2. For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\setminus {\varvec{\Theta }}_0^{\text {o}}\), it is clear that there exist \(\nu \), \({\varvec{y}}\) and \({\varvec{z}}\) such that \(\tilde{{\varvec{\theta }}}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\) with \({\tilde{\alpha }}_1<\cdots<{\tilde{\alpha }}_K<\frac{1}{2}\) and \({\tilde{\beta }}_1<\cdots<{\tilde{\beta }}_L<\frac{1}{2}\). Similarly, we have \(\tilde{{\varvec{\theta }}}'=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}')\big )\) with \({\tilde{\alpha }}_1'<\cdots<{\tilde{\alpha }}_K'<\frac{1}{2}\) and \({\tilde{\beta }}_1'<\cdots<{\tilde{\beta }}_L'<\frac{1}{2}\) for some \(\nu '\), \({\varvec{y}}'\) and \({\varvec{z}}'\). It then follows from Lemma 3 that \(\tilde{{\varvec{\theta }}} \sim {\varvec{\theta }}\) and \(\tilde{{\varvec{\theta }}'} \sim {\varvec{\theta }}'\), and \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};{\varvec{\theta }})\) and \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')=P_o({\varvec{A}};{\varvec{\theta }}')\). Thus, it suffices to prove that \(\tilde{{\varvec{\theta }}}=\tilde{{\varvec{\theta }}'}\) given \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')\).
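It follows from Lemma 1 that \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')\) if and only if \(P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )\). Furthermore, as \(\tilde{{\varvec{\theta }}}\) and \(\tilde{{\varvec{\theta }}}'\notin {\varvec{\Theta }}^{\text {o}}_0\), it can be verified that \(\psi (\tilde{{\varvec{\theta }}}),\psi (\tilde{{\varvec{\theta }}}')\notin {\varvec{\Omega }}_0\). With the assumption that \(n\ge \max \{2^{K+1},2^{L+1}\}\), it follows from Proposition 1 that \(P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )\) if and only if there exists \(\sigma =(\sigma _1,\sigma _2)\) with \(\sigma _1\) and \(\sigma _2\) being permutations on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )\). Then, we have
$$\Big \{\prod _{k=1}^K{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}:{\varvec{b}}\in \{0,1\}^K\Big \}=\Big \{\prod _{k=1}^K({\tilde{\alpha }}_k')^{b_k}(1-{\tilde{\alpha }}_k')^{1-b_k}:{\varvec{b}}\in \{0,1\}^K\Big \}, \tag{18}$$
and
$$\Big \{\prod _{l=1}^L{\tilde{\beta }}_l^{c_l}(1-{\tilde{\beta }}_l)^{1-c_l}:{\varvec{c}}\in \{0,1\}^L\Big \}=\Big \{\prod _{l=1}^L({\tilde{\beta }}_l')^{c_l}(1-{\tilde{\beta }}_l')^{1-c_l}:{\varvec{c}}\in \{0,1\}^L\Big \}.$$
Thus, \(\prod _k{\tilde{\alpha }}_k=\prod _k{\tilde{\alpha }}_k'\) as they are the smallest elements of each set in (18), and further
$$(1-{\tilde{\alpha }}_K)\prod _{k=1}^{K-1}{\tilde{\alpha }}_k=(1-{\tilde{\alpha }}_K')\prod _{k=1}^{K-1}{\tilde{\alpha }}_k',$$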
as they are the second smallest elements of each set in (18). Then, we have \(\frac{{\tilde{\alpha }}_K}{1-{\tilde{\alpha }}_K}=\frac{{\tilde{\alpha }}_K'}{1-{\tilde{\alpha }}_K'}\), and thus \({\tilde{\alpha }}_K={\tilde{\alpha }}_K'\). Removing \({\tilde{\alpha }}_K^{b_K}(1-{\tilde{\alpha }}_K)^{1-b_K}\) from all items in (18), it follows from similar argument that \({\tilde{\alpha }}_{K-1}={\tilde{\alpha }}_{K-1}'\), and repeating this treatment K times finally leads to \(\tilde{{\varvec{\alpha }}}=\tilde{{\varvec{\alpha }}}'\). Similarly, we also have \(\tilde{{\varvec{\beta }}}=\tilde{{\varvec{\beta }}}'\).
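The peeling step can be made concrete. The sketch below starts from the unordered collection of values \(\prod _k{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}\) and recovers \({\tilde{\alpha }}_1<\cdots<{\tilde{\alpha }}_K<\frac{1}{2}\) by repeatedly using the two smallest values. The functions `eta_values` and `peel`, the greedy pairing used to remove a recovered factor, and the numerical tolerance are illustrative choices rather than part of the proof.

```python
from itertools import product
from math import isclose, prod

def eta_values(alpha):
    """All 2^K products prod_k alpha_k^{b_k} (1 - alpha_k)^{1 - b_k}."""
    return [prod(a if bk else 1 - a for a, bk in zip(alpha, b))
            for b in product((0, 1), repeat=len(alpha))]

def peel(values):
    """Recover alpha_1 < ... < alpha_K < 1/2 from the unordered eta values."""
    S = sorted(values)
    recovered = []
    while len(S) > 1:
        r = S[0] / S[1]                 # alpha_max / (1 - alpha_max)
        a = r / (1 + r)
        recovered.append(a)
        reduced = []
        while S:
            s = S.pop(0) / a            # smallest remaining value is a * s
            reduced.append(s)
            # Remove the partner (1 - a) * s that carries the other factor.
            j = next(i for i, v in enumerate(S)
                     if isclose(v, (1 - a) * s, rel_tol=1e-9))
            S.pop(j)
        S = sorted(reduced)             # values for the remaining K - 1 factors
    return recovered[::-1]

alpha_true = [0.1, 0.25, 0.4]           # illustrative values below 1/2
alpha_rec = peel(reversed(eta_values(alpha_true)))
```

The ordering constraint is what makes the two smallest values informative: with every \({\tilde{\alpha }}_k<\frac{1}{2}\), the smallest value sets all \(b_k=1\), and the second smallest flips only the coordinate with the largest \({\tilde{\alpha }}_k\).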
Next, for any \(s\in \{1,\ldots ,2^K\}\) and \(t\in \{1,\ldots ,2^L\}\), let \(s=\sum _{k=1}^Kb_k2^{k-1}+1\) and \(t=\sum _{l=1}^Lc_l2^{l-1}+1\) for some \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\), and \(\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1\) and \(\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1\) for some \({\varvec{b}}'\in \{0,1\}^K\) and \({\varvec{c}}'\in \{0,1\}^L\). It then follows from \(\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )\) that
which leads to
It then follows from the definition of \({\varvec{\Theta }}_0^{\text {o}}\) immediately that \({\varvec{b}}'={\varvec{b}}\), and thus \(\sigma _1(s)=s\). Similarly, \(\sigma _2(t)=t\), and thus \(\psi (\tilde{{\varvec{\theta }}}')=\psi (\tilde{{\varvec{\theta }}})\). The desired result then follows from Lemma 1 immediately.
Cite this article
Zhang, J., Wang, J. Identifiability and parameter estimation of the overlapped stochastic co-block model. Stat Comput 32, 57 (2022). https://doi.org/10.1007/s11222-022-10114-1