Dynamic top-k influence maximization in social networks

Zhang, Binbin; Wang, Hao; U, Leong Hou

doi:10.1007/s10707-020-00419-6

Dynamic top-k influence maximization in social networks

Published: 24 July 2020

Volume 26, pages 323–346, (2022)
Cite this article

GeoInformatica Aims and scope Submit manuscript

571 Accesses
Explore all metrics

Abstract

The problem of top-k influence maximization is to find the set of k users in a social network that can maximize the spread of influence under certain influence propagation model. This paper studies the influence maximization problem together with network dynamics. For example, given a real-life social network that evolves over time, we want to find k most influential users on everyday basis. This dynamic influence maximization problem has wide applications in practice. However, to our best knowledge, there is little prior work that studies this problem. Applying existing influence maximization algorithms at every time step provides a straightforward solution to the dynamic top-k influence maximization problem. Such a solution is, however, inefficient as it completely ignores the smoothness of network change. By analyzing two real social networks, Brightkite and Gowalla, we observe that the top-k influential set, as well as its influence value, does not change dramatically over time. Hence, it is possible to find the new top-k influential set by updating the previous one. We propose an efficient incremental update framework that takes advantage of such smoothness of network change. The proposed method achieves the same approximation ratio of 1 − e^− 1 as its state-of-the-art static counterparts. Our experiments show that the proposed method outperforms the straightforward solution by a wide margin.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

The homophily principle in social network analysis: A survey

Article 18 January 2022

Complex Networks: a Mini-review

Article 13 July 2020

A new semi-local centrality for identifying influential nodes based on local average shortest path with extended neighborhood

Article Open access 13 April 2024

Notes

https://www.brandwatch.com/blog/react-50-most-influential-people-on-twitter-2019/
https://twitter.com/
To be exact, I(Φ₁) = 2.8224942 and I(Φ₂) = 4.117388.
Given a directed graph G = (V,E), the reachability of a vertex v ∈ V is the number of vertices that are reachable from v (inclusive of v).
SNAP: http://snap.stanford.edu/index.html
We are aware of the logical fallacy of post hoc ergo propter hoc. Nonetheless, we believe that the simple assumption here reflects certain aspects of the reality. After all, deliberate design of more sophisticated models will be a detraction from our main focus in this paper.
We put E_δ in the notation $G_{\delta }=\left (V_{s},E_{s},E_{\delta }\right )$ to emphasize the fact that G_s “grows out of” E_δ, and E_δ is of essential importance in G_s, as we will see shortly in Lemmata 1-2.
Note that $d^{\prime }_{v}\equiv {\Delta } I^{\prime }({\varPhi }_{i-1})$ for any v. In our implementation, we override the function by passing only one value instead of a vector.
We remove the experiments of varying k due to limitations of space and analogous performance to other sensitivity experiments.
http://www.arXiv.org
Due to the limitation of space, we remove the experiments of varying this parameter p. The results are very similar to what we present in this section.

References

Bao H, Chang EC (2010) Adheat: an influence-based diffusion model for propagating hints to match ads. In: WWW
Chen W, Wang C, Wang Y (2010) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: KDD
Goyal A, Bonchi F, Lakshmanan LVS (2011) A data-based approach to social influence maximization. PVLDB 5(1):73–84
Google Scholar
Yuan Y, Wang G, Wang H, Chen L (2011) Efficient subgraph search over large uncertain graphs. PVLDB 4(11):876–886
Google Scholar
Yuan Y, Wang G, Chen L, Wang H (2012) Efficient subgraph similarity search on large probabilistic graph databases. PVLDB 5(9):800–811
Google Scholar
Yuan Y, Wang G, Jeffery YX, Chen L (2015) Efficient distributed subgraph similarity matching. VLDB J 24(3):369–394
Article Google Scholar
Yuan Y, Lian X, Chen L, Sun Y, Wang G (2016) RSKNN: kNN search on road networks by incorporating social influence. TKDE 28(6):1575–1588
Google Scholar
Chen L, Shang S, Yang C, Li J (2020) Spatial keyword search: a survey. GeoInformatica 24(1):85–106
Article Google Scholar
Chen L, Shang S (2019) Approximate spatio-temporal top-k publish/subscribe. World Wide Web Journal 22(5):2153–2175
Article Google Scholar
Shang S, Chen L, Wei Z, Jensen CS, Zheng K, Kalnis P (2018) Parallel trajectory similarity joins in spatial networks. VLDB J 27(3):395–420
Article Google Scholar
Shang S, Chen L, Zheng K, Jensen CS, Wei Z, Kalnis P (2019) Parallel trajectory-to-location join. TKDE 31(6):1194–1207
Google Scholar
Chen Y-C, Peng W-C, Lee S-Y (2012) Efficient algorithms for influence maximization in social networks. KAIS 33(3):577–601
Google Scholar
Domingos P, Richardson M (2001) Mining the network value of customers. In: KDD
Goyal A, Lu W, Lakshmanan LVS (2011) CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: WWW (Companion Volume),
Kempe D, Kleinberg JM, Tardos E (2003) Maximizing the spread of influence through a social network. In: KDD
Kimura M, Saito K (2006) Tractable models for information diffusion in social networks. In: PKDD
Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen JM, Glance NS (2007) Cost-effective outbreak detection in networks. In: KDD
Long C, Wong RC-W (2011) Minimizing seed set for viral marketing. In: ICDM
Richardson M, Domingos R (2002) Mining knowledge-sharing sites for viral marketing. In: KDD
Cohen E (1997) Size-estimation framework with applications to transitive closure and reachability. JCSS 55(3):441–453
MathSciNet MATH Google Scholar
Cheng S, Shen H, Huang J, Zhang G, Cheng X (2013) Staticgreedy: solving the scalability-accuracy dilemma in influence maximization. In: CIKM
Kim J, Kim S-K, Yu H (2013) Scalable and parallelizable processing of influence maximization for large-scale social networks. In: ICDE
Liu X, Li M, Li S, Peng S, Liao X, Lu X (2014) IMGPU: Gpu-accelerated influence maximization in large-scale social networks. TPDS 25(1):136–145
Google Scholar
Zhuang H, Sun Y, Tang J, Zhang J, Sun X (2013) Influence maximization in dynamic social networks. In: ICDM
Cho E, Myers SA, Leskovec J (2011) Friendship and mobility: user movement in location-based social networks. In: KDD
Tarjan RE (1972) Depth-first search and linear graph algorithms. SICOMP 1(2):146–160
Article MathSciNet Google Scholar
Wang C, Tang J, Sun J, Han J (2011) Dynamic social influence analysis through time-dependent factor graphs. In: ASONAM
Eppstein D (1994) Finding the k shortest paths. In: FOCS

Download references

Author information

Authors and Affiliations

School of Information Science and Engineering, Yunnan University, Kunming, China
Binbin Zhang
Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Hao Wang
State Key Laboratory of Internet of Things for Smart City, Department of Computer, Information Science, University of Macau, Zhuhai, China
Leong Hou U

Authors

Binbin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Hao Wang
View author publications
You can also search for this author in PubMed Google Scholar
Leong Hou U
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Hao Wang or Leong Hou U.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: : Proof of Lemma 3

Consider the set G_s of all possible random subgraphs of G. The size of G_s is n = 2^|E| since each edge is considered independently of the others. By definition, the influence of $U\subseteq V$ is

$$ \begin{array}{@{}rcl@{}} I(U|G)=\sum\limits_{i=1}^{n}p_{i}\cdot r\left( U\left|G_{i}\right.\right), \end{array} $$

where G_i = (V,E_i) ∈G_s and p_i is its occurring probability,

$$ \begin{array}{@{}rcl@{}} p_{i}=\underset{e_{u,v}\in E_{i}}{\prod}p_{u,v}\cdot\underset{e_{u,v}\not\in E_{i}}{\prod}\left( 1-p_{u,v}\right). \end{array} $$

Similarly, there is a set $\mathbf {G}_{s}^{+}$ containing all the $2^{\left |E^{+}\right |}$ possible random graphs of G⁺. Each G_i ∈G_s is also a possible random subgraph of G⁺, though its occurring probability may change since p_u,v’s may change.

$\mathbf {G}_{s}^{+}$ can be constructed by augmenting G_i’s. Consider the set of newly added edges in G⁺. (There are in total $c=\left |E^{+}\setminus E\right |$ such edges.) In each G_i, we enumerate all possible m = 2^c occurrence patterns of these newly added edges. As a consequence, each G_i ∈G_s corresponds to a clutter of m subgraphs in $\mathbf {G}_{s}^{+}$, denoted as $\left \{G^{+}_{i1},G^{+}_{i2},\cdots ,G^{+}_{im}\right \}$. The probability distribution within the clutter is the same for every G_i; let q_j be the occurring probability of $G^{+}_{ij}$ given G_i. Now we get

$$ \begin{array}{@{}rcl@{}} I(U|G^{+})=\sum\limits_{i=1}^{n}p^{\prime}_{i}\left( \sum\limits_{j=1}^{m} q_{j}\cdot r\left( U\left|G^{+}_{ij}\right.\right)\right), \end{array} $$

and thus

$$ \begin{array}{@{}rcl@{}} {\Delta} I(U)=\sum\limits_{i=1}^{n}p^{\prime}_{i}\left( \sum\limits_{j=1}^{m} q_{j}\cdot r\left( U\left|G^{+}_{ij}\right.\right)\right)-\sum\limits_{i=1}^{n}p_{i}\cdot r\left( U\left|G_{i}\right.\right). \end{array} $$

Based on the above equation, we can revise ΔI(S ∪ T) ≤ I(S) + I(T) into the following inequality.

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i=1}^{n} p^{\prime}_{i}\left( \sum\limits_{j=1}^{m} q_{j}\cdot\beta(S,T,G^{+}_{ij})\right)\geq \sum\limits_{i=1}^{n} p_{i}\cdot\beta(S,T,G_{i}), \end{array} $$

(2)

where $\beta (S,T,\cdot )=r\left (S\left |\cdot \right .\right )+r\left (T\left |\cdot \right .\right )-r\left (S\cup T\left |\cdot \right .\right )$ gives the number of vertices that are commonly reachable from S and T in a given graph. One intuitive and yet essential property of β is that if $G\subseteq G^{\prime }$ then $\beta (S,T,G)\leq \beta (S,T,G^{\prime })$.

In Eq. 2 we notice that $p^{\prime }_{i}$ could be smaller than p_i. However, whenever this happens, there must exist some ℓ such that (i) $p_{\ell }>p^{\prime }_{i}$ and (ii) $G_{i}\subseteq G_{\ell }$ thus β(S,T,G_i) ≤ β(S,T,G_ℓ). In general, graph update ΔE⁺ tends to shift the probability distribution towards subgraphs that have more edges. Therefore, we have

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i=1}^{n} p^{\prime}_{i}\left( \sum\limits_{j=1}^{m} q_{j}\cdot\beta(S,T,G^{+}_{ij})\right)&\geq& \sum\limits_{i=1}^{n} p_{i}\left( \sum\limits_{j=1}^{m} q_{j}\cdot\beta(S,T,G^{+}_{ij})\right)\\ &\geq&\sum\limits_{i=1}^{n} p_{i}\cdot\beta(S,T,G_{i}). \end{array} $$

The last inequality holds because all $G^{+}_{ij}$’s are supergraphs of G_i, therefore $\beta (S,T,G^{+}_{ij})\geq \beta (S,T,G_{i})$ and so is their non-negative linear combination.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zhang, B., Wang, H. & U, L.H. Dynamic top-k influence maximization in social networks. Geoinformatica 26, 323–346 (2022). https://doi.org/10.1007/s10707-020-00419-6

Download citation

Received: 02 May 2020
Revised: 18 June 2020
Accepted: 10 July 2020
Published: 24 July 2020
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10707-020-00419-6

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Dynamic top-k influence maximization in social networks

Abstract

Access this article

Similar content being viewed by others

The homophily principle in social network analysis: A survey

Complex Networks: a Mini-review

A new semi-local centrality for identifying influential nodes based on local average shortest path with extended neighborhood

Notes

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Publisher’s note

Appendix: : Proof of Lemma 3

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Dynamic top-k influence maximization in social networks

Abstract

Access this article

Similar content being viewed by others

The homophily principle in social network analysis: A survey

Complex Networks: a Mini-review

A new semi-local centrality for identifying influential nodes based on local average shortest path with extended neighborhood

Notes

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Publisher’s note

Appendix: : Proof of Lemma 3

Appendix: : Proof of Lemma 3

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation