
Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data

Published in: Data Mining and Knowledge Discovery

Abstract

Non-negative tensor factorization models enable predictive analysis on count data. Among them, Bayesian Poisson–Gamma models can derive full posterior distributions of latent factors and are less sensitive to sparse count data. However, current inference methods for these Bayesian models adopt restricted update rules for the posterior parameters. They also fail to share update information to better cope with data sparsity. Moreover, these models lack a component that handles the imbalance in count data values. In this paper, we propose a novel variational auto-encoder framework, called VAE-BPTF, that addresses these issues. It uses multi-layer perceptron networks to encode and share complex update information. The encoded information is then reweighted per data instance to penalize common data values before being aggregated to compute the posterior parameters of the latent factors. In evaluations on synthetic data, VAE-BPTF tended to recover the correct number of latent factors and posterior parameter values. It also outperformed current models in both reconstruction error and latent factor (semantic) coherence across five real-world datasets. Furthermore, the latent factors inferred by VAE-BPTF were perceived as meaningful and coherent in a qualitative analysis.
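The generative core shared by the Poisson tensor-factorization models discussed in the abstract can be sketched as follows. This is an illustrative sketch only: the tensor sizes, the number of latent factors K, and the Gamma draws are arbitrary choices, not the paper's settings.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration: a small 3-mode count tensor
# with K latent factors per mode.
U, V, W, K = 4, 5, 6, 3

# Non-negative factor matrices; in the Bayesian model these would be
# Gamma-distributed latent factors.
A = rng.gamma(1.0, 1.0, size=(U, K))
B = rng.gamma(1.0, 1.0, size=(V, K))
C = rng.gamma(1.0, 1.0, size=(W, K))

# CP-style Poisson rate: lam[u, v, w] = sum_k A[u, k] * B[v, k] * C[w, k]
lam = np.einsum('uk,vk,wk->uvw', A, B, C)

# Draw a synthetic count tensor and score it under the Poisson likelihood:
# log p(Y) = sum_{uvw} [ Y * log(lam) - lam - log(Y!) ]
Y = rng.poisson(lam)
log_fact = np.array([lgamma(y + 1.0) for y in Y.ravel()]).reshape(Y.shape)
log_lik = np.sum(Y * np.log(lam) - lam - log_fact)
```

Inference then amounts to estimating (posteriors over) A, B, and C from an observed Y; the paper's contribution is how those posterior parameters are computed.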

Notes

  1. For a more comprehensive review on this subject, we refer readers to Zhang et al. (2019).

  2. For a more detailed mathematical description, we refer readers to Gopalan et al. (2015).

  3. For simplicity, we omitted the activation functions and the bias terms in between.

  4. For simplicity, we omitted the prior shape \(\alpha \) and rate \(\beta \) in Eq. 10 and in Fig. 3. They are not directly used to compute \(\alpha _{uk}\) and \(\beta _{uk}\) in Eqs. 8 and 9. Instead, they are leveraged by the KL regularization in Eq. 5.
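The KL regularization mentioned here has a closed form between two Gamma distributions. Below is a hedged sketch under the shape/rate parameterization (the exact form of the paper's Eq. 5 is not reproduced on this page); the central-difference digamma is a lightweight stdlib stand-in for the special function.

```python
from math import lgamma, log

def digamma(x, h=1e-5):
    # Central-difference approximation of psi(x) = d/dx log Gamma(x);
    # adequate for a sketch (error on the order of h**2).
    return (lgamma(x + h) - lgamma(x - h)) / (2.0 * h)

def kl_gamma(a_q, b_q, a_p, b_p):
    """KL( Gamma(a_q, rate=b_q) || Gamma(a_p, rate=b_p) ) in closed form."""
    return ((a_q - a_p) * digamma(a_q)
            - lgamma(a_q) + lgamma(a_p)
            + a_p * (log(b_q) - log(b_p))
            + a_q * (b_p - b_q) / b_q)
```

The term vanishes when the posterior matches the prior and is positive otherwise, which is what lets the prior shape and rate regularize the encoded posterior parameters.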

  5. The combinations include either softplus or sigmoid for \(\text {h}(\cdot )\), and either of these or ReLU for \(\text {q}(\cdot )\).

  6. The symbol \({\mathbb {1}}\) denotes an all-ones matrix of the same size as \({\varvec{Y}}\).

  7. For more details about the exact formulas of TE\((\epsilon _{uk};\alpha _{uk})\) and \(\text {R}\big (\text {log}(\frac{\epsilon _{uk}}{\alpha _{uk}}), \text {log}(\alpha _{uk})\big )\), and their derivation, we refer readers to the supplementary materials of Jankowiak and Obermeyer (2018).

  8. We found that, compared to a zero mean, a small positive mean for the latter Normal distribution stabilizes the algorithm immediately after initialization.

  9. https://s3-us-west-2.amazonaws.com/ai2-s2-research-public/open-corpus/index.html.

  10. https://www.kaggle.com/benhamner/nips-papers.

  11. http://www.shandesitong.com/.

  12. http://jmcauley.ucsd.edu/data/amazon.

  13. The prediction targets in this case are the ratings from the Amazon Prime video review dataset.

  14. https://github.com/aschein/bptf/blob/master/code/bptf.py.

  15. https://github.com/ch237/BayesPoissonFactor/blob/master/PTF_OnlineGibbs.m.

  16. https://www.tensortoolbox.org/cp_apr_doc.html.

  17. The embedding was done based on the entity IDs.

  18. Three zero values per non-zero value.

  19. We show only the log-likelihoods (LLs) of VAE-BPTF and BPTF, as the other Poisson-based models are significantly inferior in this respect.

  20. The NPMI scoring uses a large Wikipedia dump hosted by Palmetto: http://palmetto.aksw.org.
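NPMI itself is a simple function of word and word-pair probabilities estimated from the reference corpus; a minimal sketch, where the smoothing constant `eps` is an illustrative choice:

```python
from math import log

def npmi(p_i, p_j, p_ij, eps=1e-12):
    """Normalised pointwise mutual information for one word pair.

    p_i and p_j are marginal word probabilities and p_ij their joint
    (co-occurrence) probability, estimated from a reference corpus such
    as the Wikipedia dump above.  Returns a score in [-1, 1]: 1 for
    perfect co-occurrence, 0 for independence, -1 as p_ij -> 0.
    """
    pmi = log((p_ij + eps) / (p_i * p_j))
    return pmi / -log(p_ij + eps)
```

A topic's coherence score is then typically the mean NPMI over pairs of its top words.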

  21. The Game data and the Amazon rating data are not text data, and thus NPMI is not applicable.

  22. We used the cmdscale function in R that implements the classical multi-dimensional scaling.
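Classical multi-dimensional scaling, as implemented by R's cmdscale, can be sketched in a few lines; this illustrative version double-centres the squared distance matrix and embeds with the top eigenvectors:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS from a pairwise distance matrix D.

    Mirrors what R's cmdscale computes: double-centre the squared
    distances, then embed with the k largest eigenvector directions.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix of centred points
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # indices of the k largest
    w_top = np.clip(w[idx], 0.0, None)    # guard tiny negative eigenvalues
    return V[:, idx] * np.sqrt(w_top)
```

For inputs that are genuinely Euclidean distances, the embedding reproduces them exactly (up to rotation and sign).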

References

  • Ahn HJ (2008) A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem. Inf Sci 178(1):37–51


  • Aletras N, Stevenson M (2013) Evaluating topic coherence using distributional semantics. In: Proceedings of the 10th international conference on computational semantics (IWCS 2013)–Long Papers, pp 13–22

  • Buntine WL, Mishra S (2014) Experiments with non-parametric topic models. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 881–890

  • Chi EC, Kolda TG (2012) On tensors, sparsity, and nonnegative factorizations. SIAM J Matrix Anal Appl 33(4):1272–1299


  • Cox TF, Cox MA (2000) Multidimensional scaling. Chapman and Hall/CRC, Boca Raton


  • Deng Z, Navarathna R, Carr P, Mandt S, Yue Y, Matthews I, Mori G (2017) Factorized variational autoencoders for modeling audience reactions to movies. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2577–2586

  • Figurnov M, Mohamed S, Mnih A (2018) Implicit reparameterization gradients. In: Advances in neural information processing systems, pp 441–452

  • Friedlander MP, Hatz K (2008) Computing non-negative tensor factorizations. Optim Methods Softw 23(4):631–647


  • Gopalan P, Hofman JM, Blei DM (2015) Scalable recommendation with hierarchical Poisson factorization. In: Proceedings of the 31st conference on uncertainty in artificial intelligence, AUAI Press, UAI’15, pp 326–335

  • He X, Liao L, Zhang H, Nie L, Hu X, Chua TS (2017) Neural collaborative filtering. In: Proceedings of the 26th international conference on world wide web, pp 173–182

  • He X, Du X, Wang X, Tian F, Tang J, Chua TS (2018) Outer product-based neural collaborative filtering. In: Proceedings of the 27th international joint conference on artificial intelligence, AAAI Press, IJCAI’18, pp 2227–2233

  • Hidasi B, Karatzoglou A, Baltrunas L, Tikk D (2016) Session-based recommendations with recurrent neural networks. In: Proceedings of the 4th international conference on learning representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, conference track proceedings

  • Hinrich JL, Nielsen SFV, Madsen KH, Mørup M (2018) Variational Bayesian partially observed non-negative tensor factorization. In: 2018 IEEE 28th international workshop on machine learning for signal processing (MLSP), pp 1–6. https://doi.org/10.1109/MLSP.2018.8516924

  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780


  • Hu C, Rai P, Chen C, Harding M, Carin L (2015) Scalable Bayesian non-negative tensor factorization for massive count data. In: Machine learning and knowledge discovery in databases. Springer International Publishing, pp 53–70

  • Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: 2008 Eighth IEEE international conference on data mining, IEEE, pp 263–272

  • Jankowiak M, Obermeyer F (2018) Pathwise derivatives beyond the reparameterization trick. In: International conference on machine learning, pp 2240–2249

  • Kim D, Park C, Oh J, Lee S, Yu H (2016) Convolutional matrix factorization for document context-aware recommendation. In: Proceedings of the 10th ACM conference on recommender systems, ACM, RecSys ’16, pp 233–240

  • Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  • Kingma DP, Welling M (2014) Auto-encoding variational Bayes. In: Proceedings of the 2nd international conference on learning representations (ICLR)

  • Knowles DA (2015) Stochastic gradient variational Bayes for gamma approximating distributions. arXiv preprint arXiv:1509.01631

  • Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3):455–500


  • Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25, Curran Associates, Inc., pp 1097–1105. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

  • Li S, Kawale J, Fu Y (2015) Deep collaborative filtering via marginalized denoising auto-encoder. In: Proceedings of the 24th ACM international on conference on information and knowledge management, ACM, CIKM ’15, pp 811–820

  • Liu H, Li Y, Tsang M, Liu Y (2019) Costco: a neural tensor completion model for sparse tensors. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, Association for Computing Machinery, pp 324–334

  • Rashid AM, Karypis G, Riedl J (2008) Learning preferences of new users in recommender systems: an information theoretic approach. ACM SIGKDD Explor Newsl 10(2):90–100


  • Schein A, Paisley J, Blei DM, Wallach H (2015) Bayesian Poisson tensor factorization for inferring multilateral relations from sparse dyadic event counts. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 1045–1054

  • Schein A, Zhou M, Blei DM, Wallach H (2016) Bayesian Poisson tucker decomposition for learning the structure of international relations. In: Proceedings of the 33rd international conference on international conference on machine learning—volume 48, JMLR.org, ICML’16, pp 2810–2819. http://dl.acm.org/citation.cfm?id=3045390.3045686

  • Schmidt MN, Mohamed S (2009) Probabilistic non-negative tensor factorization using Markov chain Monte Carlo. In: 2009 17th European signal processing conference, IEEE, pp 1918–1922

  • Sedhain S, Menon AK, Sanner S, Xie L (2015) Autorec: autoencoders meet collaborative filtering. In: Proceedings of the 24th international conference on world wide web, ACM, WWW ’15 Companion, pp 111–112

  • Shashua A, Hazan T (2005) Non-negative tensor factorization with applications to statistics and computer vision. In: Proceedings of the 22nd international conference on machine learning, ACM, pp 792–799

  • Welling M, Weber M (2001) Positive tensor factorization. Pattern Recogn Lett 22(12):1255–1261


  • Wu X, Shi B, Dong Y, Huang C, Chawla NV (2019) Neural tensor factorization for temporal interaction learning. In: Proceedings of the twelfth ACM international conference on web search and data mining, Association for Computing Machinery, New York, NY, USA, WSDM ’19, pp 537–545. https://doi.org/10.1145/3289600.3290998

  • Xue HJ, Dai X, Zhang J, Huang S, Chen J (2017) Deep matrix factorization models for recommender systems. In: Proceedings of the 26th international joint conference on artificial intelligence, IJCAI-17, pp 3203–3209

  • Yu Y, Zhang L, Wang C, Gao R, Zhao W, Jiang J (2019) Neural personalized ranking via Poisson factor model for item recommendation. Complexity

  • Zhang S, Yao L, Sun A, Tay Y (2019) Deep learning based recommender system: a survey and new perspectives. ACM Comput Surv 52(1):5:1–5:38


  • Zhou M, Hannah L, Dunson D, Carin L (2012) Beta-negative binomial process and Poisson factor analysis. In: Proceedings of the 15th international conference on artificial intelligence and statistics, PMLR, Proceedings of Machine Learning Research, vol 22, pp 1462–1471


Author information


Corresponding author

Correspondence to Yuan Jin.

Additional information

Responsible editor: Sriraam Natarajan.



About this article


Cite this article

Jin, Y., Liu, M., Li, Y. et al. Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data. Data Min Knowl Disc 35, 505–532 (2021). https://doi.org/10.1007/s10618-020-00723-7

