Abstract
With the rapid development of artificial intelligence and mobile networks, the past decade has witnessed the flourishing of social media, and predicting the popularity of information diffusion in social media has attracted wide attention in both academia and industry. However, existing popularity prediction methods either rely heavily on human experience to handcraft features or to specify a generative model, or depend largely on the underlying user relation network for embedding learning. Motivated by this observation, this paper studies the prediction of information diffusion popularity based only on early repost information, assuming that the underlying user relation network is unknown. To solve this problem, we propose RNe2Vec (repost network to vector), a repost network embedding-based diffusion popularity prediction algorithm. Specifically, we first build a repost network from the early repost data and then use biased random walks to generate node sequences, with walking rules designed to capture different repost behaviors. We then employ the skip-gram method to learn low-dimensional node vectors from the node sequences. Finally, we apply principal component analysis (PCA) to the node vectors for dimensionality reduction and combine the embedding features with handcrafted features to train the downstream machine learning models. Experimental results on a microblog dataset show that incorporating network embedding features can significantly improve the overall prediction accuracy.
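The first two steps of the pipeline described above (building a repost network and generating biased random walks) can be illustrated with a minimal Python sketch. It is not the RNe2Vec implementation: the abstract does not specify the walking rules, so the degree-based bias used here, along with the function names `build_repost_network`, `biased_walk`, and `generate_walks` and the toy repost records, are assumptions made purely for illustration.

```python
import random


def build_repost_network(reposts):
    """Build a directed graph from (source_user, reposting_user) pairs."""
    graph = {}
    for source, reposter in reposts:
        graph.setdefault(source, []).append(reposter)
        graph.setdefault(reposter, [])  # make sure leaf users appear as nodes
    return graph


def biased_walk(graph, start, length, bias, rng):
    """Walk along repost edges, stopping early at users nobody reposted.

    ASSUMPTION: the next node is drawn with weight (1 + out_degree) ** bias,
    so bias > 0 favors users who were themselves reposted. This rule is a
    placeholder, not the walking rule defined in the paper.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        weights = [(1 + len(graph[n])) ** bias for n in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node, length, bias, seed=0):
    """Start walks_per_node walks from every node in the repost network."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for node in sorted(graph):
            walks.append(biased_walk(graph, node, length, bias, rng))
    return walks


# Hypothetical early repost records: (source_user, reposting_user).
reposts = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("C", "F")]
graph = build_repost_network(reposts)
walks = generate_walks(graph, walks_per_node=2, length=4, bias=1.0)
```

Each walk is a list of user identifiers and plays the role of a sentence for the skip-gram step. The resulting node vectors would then be reduced with PCA and concatenated with handcrafted features before training a downstream model.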
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 61702059, 61966008), the Fundamental Research Funds for the Central Universities (Nos. 2019CDXYJSJ0021, 2020CDCGJSJ041), and the Frontier and Application Foundation Research Program of Chongqing City (No. cstc2018jcyjAX0340).
About this article
Cite this article
Shang, J., Huang, S., Zhang, D. et al. RNe2Vec: information diffusion popularity prediction based on repost network embedding. Computing 103, 271–289 (2021). https://doi.org/10.1007/s00607-020-00858-x
DOI: https://doi.org/10.1007/s00607-020-00858-x
Keywords
- Information diffusion
- Mobile networks
- Social network analysis
- Network embedding
- Popularity prediction
- Artificial intelligence