Dynamic network embedding via incremental skip-gram with negative sampling

  • Research Paper
  • Science China Information Sciences

Abstract

Network representation learning, an approach to learning low-dimensional representations of vertices, has attracted considerable research attention recently. It has proven extremely useful in many machine learning tasks over large graphs. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. The fundamental problem of continuously and efficiently capturing the dynamic properties of a dynamic network remains unsolved. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize its performance guarantee. Specifically, we first partition a dynamic network over time into the updated network, which covers the addition and deletion of links and vertices, and the retained network. Then we factorize the objective function of network embedding into the added, vanished and retained parts of the network. Next we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the node vectors and the parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference from that of the original objective function. The first-order moment of the objective difference converges in order of \(\mathbb{O}(\frac{1}{n^{2}})\), and the second-order moment of the objective difference can be stabilized in order of \(\mathbb{O}(1)\). We perform extensive experiments on multiple large real-world network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework, achieving a speedup of up to 22 times. Experimental results show that our proposal can significantly reduce training time while preserving comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of dynamic network embedding.
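
To make the partition-then-update idea in the abstract concrete, the following is a minimal Python sketch and not the authors' algorithm. It splits two undirected edge snapshots into added, vanished and retained sets, then runs skip-gram-with-negative-sampling gradient steps only on pairs drawn from the added edges. All names (`partition_edges`, `sgns_step`, `update_on_added`) are hypothetical, and the uniform noise distribution and the choice to simply leave vanished edges out of training are assumptions made for illustration; the abstract does not specify either.

```python
import math
import random
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def partition_edges(old_edges: Iterable[Edge], new_edges: Iterable[Edge]):
    """Split two undirected snapshots into (added, vanished, retained) edge sets."""
    old: Set[Edge] = {tuple(sorted(e)) for e in old_edges}
    new: Set[Edge] = {tuple(sorted(e)) for e in new_edges}
    return new - old, old - new, old & new


def sigmoid(x: float) -> float:
    x = max(-30.0, min(30.0, x))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-x))


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sgns_step(emb, ctx, u: int, v: int, label: int, lr: float) -> None:
    """One gradient-ascent step for the pair (u, v).

    label=1 is an observed pair (maximise log sigmoid(s)),
    label=0 is a noise pair (maximise log sigmoid(-s)), with s = emb[u] . ctx[v].
    """
    g = lr * (label - sigmoid(dot(emb[u], ctx[v])))
    old_u = list(emb[u])  # keep the pre-update node vector for the ctx update
    for i in range(len(old_u)):
        emb[u][i] += g * ctx[v][i]
        ctx[v][i] += g * old_u[i]


def update_on_added(emb, ctx, added, negatives: int = 2,
                    lr: float = 0.05, seed: int = 0) -> None:
    """Train only on pairs from added edges (assumption: vanished edges are skipped)."""
    rng = random.Random(seed)
    n = len(emb)
    for a, b in sorted(added):
        for u, v in ((a, b), (b, a)):  # undirected edge: train both directions
            sgns_step(emb, ctx, u, v, 1, lr)
            for _ in range(negatives):
                noise = rng.randrange(n)  # uniform noise sampling: an assumption
                if noise != v:
                    sgns_step(emb, ctx, u, noise, 0, lr)
```

In this sketch, calling `partition_edges` on consecutive snapshots and passing the added set to `update_on_added` modifies only the node vectors of endpoints of added edges (plus the parameter vectors of their partners and sampled noise nodes). This is the sense in which an incremental update can avoid retraining on the retained part of the network.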

Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2016YFB1000103) and the National Natural Science Foundation of China (Grant Nos. 61872022, 61772151, 61421003, SKLSDE-2018ZX-16).

Author information

Corresponding author

Correspondence to Jianxin Li.

About this article

Cite this article

Peng, H., Li, J., Yan, H. et al. Dynamic network embedding via incremental skip-gram with negative sampling. Sci. China Inf. Sci. 63, 202103 (2020). https://doi.org/10.1007/s11432-018-9943-9

  • DOI: https://doi.org/10.1007/s11432-018-9943-9
