Dynamic network embedding via incremental skip-gram with negative sampling

Peng, Hao; Li, Jianxin; Yan, Hao; Gong, Qiran; Wang, Senzhang; Liu, Lin; Wang, Lihong; Ren, Xiang

doi:10.1007/s11432-018-9943-9

Research Paper
Published: 18 September 2020

Volume 63, article number 202103 (2020)

Science China Information Sciences

Hao Peng¹,²,
Jianxin Li¹,²,
Hao Yan¹,²,
Qiran Gong²,
Senzhang Wang³,
Lin Liu²,
Lihong Wang⁴ &
…
Xiang Ren⁵

Abstract

Network representation learning, an approach to learning low-dimensional representations of vertices, has attracted considerable research attention recently. It has proven extremely useful in many machine learning tasks over large graphs. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. The fundamental problem of continuously and efficiently capturing the dynamic properties of a dynamic network remains unsolved. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize the performance guarantee. Specifically, we first partition a dynamic network over time into the updated part, which covers the addition and deletion of links and vertices, and the retained part. Then we factorize the objective function of network embedding into the added, vanished, and retained parts of the network. Next, we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the node and parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference from that of the original objective function. The first-order moment of the objective difference converges in order of \(\mathbb{O}(\frac{1}{n^{2}})\), and the second-order moment of the objective difference can be stabilized in order of \(\mathbb{O}(1)\). Experimental results show that our proposal can significantly reduce the training time while preserving comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of the dynamic network embedding.
We perform extensive experiments on multiple large real-world network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework, achieving up to a 22-fold speedup.
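
The partition step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`partition_edges`, `sgns_step`, `pair_loss`), the toy edge lists, the learning rate, and the fixed choice of negatives are all assumptions made for illustration. The sketch splits two snapshots of an undirected graph into added, vanished, and retained edge sets, then applies ordinary skip-gram negative-sampling updates to the added edges only. How the paper treats the vanished and retained parts is not reproduced here.

```python
import numpy as np

def partition_edges(old_edges, new_edges):
    """Split two undirected edge snapshots into (added, vanished, retained)."""
    old = {tuple(sorted(e)) for e in old_edges}
    new = {tuple(sorted(e)) for e in new_edges}
    return new - old, old - new, old & new

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pair_loss(emb, ctx, u, v, negatives):
    """Negative-sampling loss for one positive pair (u, v)."""
    loss = -np.log(sigmoid(emb[u] @ ctx[v]))
    for n in negatives:
        loss -= np.log(sigmoid(-(emb[u] @ ctx[n])))
    return loss

def sgns_step(emb, ctx, u, v, negatives, lr=0.025):
    """One plain SGD step on pair_loss; updates emb and ctx in place."""
    grad_u = np.zeros_like(emb[u])
    for target, label in [(v, 1.0)] + [(n, 0.0) for n in negatives]:
        g = sigmoid(emb[u] @ ctx[target]) - label  # d loss / d (dot product)
        grad_u += g * ctx[target]                  # uses ctx before its update
        ctx[target] -= lr * g * emb[u]
    emb[u] -= lr * grad_u

# Toy snapshots (hypothetical data): edge (1, 2) disappears, edge (3, 4) appears.
old_edges = [(0, 1), (1, 2), (2, 3)]
new_edges = [(1, 0), (2, 3), (3, 4)]
added, vanished, retained = partition_edges(old_edges, new_edges)

num_nodes, dim = 5, 8
rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(num_nodes, dim))  # node vectors
ctx = np.zeros((num_nodes, dim))                    # context (parameter) vectors

u, v = next(iter(added))
negatives = [n for n in range(num_nodes) if n not in (u, v)][:2]
loss_before = pair_loss(emb, ctx, u, v, negatives)
for _ in range(20):
    sgns_step(emb, ctx, u, v, negatives)
loss_after = pair_loss(emb, ctx, u, v, negatives)
```

Restricting updates to the changed edges is what makes an incremental scheme cheaper than retraining on the whole graph; the bounded-difference guarantees stated in the abstract concern the authors' algorithm, not this simplified sketch.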

Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2016YFB1000103) and the National Natural Science Foundation of China (Grant Nos. 61872022, 61772151, 61421003, SKLSDE-2018ZX-16).

Author information

Authors and Affiliations

Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, 100083, China
Hao Peng, Jianxin Li & Hao Yan
State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100083, China
Hao Peng, Jianxin Li, Hao Yan, Qiran Gong & Lin Liu
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Senzhang Wang
National Computer Network Emergency Response Technical Team, Coordination Center of China, Beijing, 100029, China
Lihong Wang
Department of Computer Science, University of Southern California, Los Angeles, 90089, USA
Xiang Ren

Corresponding author

Correspondence to Jianxin Li.

About this article

Cite this article

Peng, H., Li, J., Yan, H. et al. Dynamic network embedding via incremental skip-gram with negative sampling. Sci. China Inf. Sci. 63, 202103 (2020). https://doi.org/10.1007/s11432-018-9943-9

Received: 11 October 2018
Revised: 14 March 2019
Accepted: 10 June 2019
Published: 18 September 2020
DOI: https://doi.org/10.1007/s11432-018-9943-9
