Clustered Embedding Learning for Recommender Systems,arXiv - CS - Machine Learning



Clustered Embedding Learning for Recommender Systems
arXiv - CS - Machine Learning, Pub Date: 2023-02-03, DOI: arxiv-2302.01478
Yizhou Chen, Guangda Huzhang, Anxiang Zeng, Qingtao Yu, Hui Sun, Hengyi Li, Jingyi Li, Yabo Ni, Han Yu, Zhiming Zhou

In recent years, recommender systems have advanced rapidly, and embedding learning for users and items plays a critical role in them. A standard method learns a unique embedding vector for each user and item. However, such a method has two important limitations in real-world applications: 1) it is hard to learn embeddings that generalize well for users and items with rare interactions on their own; and 2) it may incur unbearably high memory costs when the number of users and items scales up. Existing approaches either address only one of these limitations or have flawed overall performance. In this paper, we propose Clustered Embedding Learning (CEL) as an integrated solution to these two problems. CEL is a plug-and-play embedding learning framework that can be combined with any differentiable feature interaction model. It is capable of achieving improved performance, especially for cold users and items, with reduced memory cost. CEL enables automatic and dynamic clustering of users and items in a top-down fashion, where clustered entities jointly learn a shared embedding. The accelerated version of CEL has an optimal time complexity, which supports efficient online updates. Theoretically, we prove the identifiability and the existence of a unique optimal number of clusters for CEL in the context of nonnegative matrix factorization. Empirically, we validate the effectiveness of CEL on three public datasets and one business dataset, showing its consistently superior performance against current state-of-the-art methods. In particular, when CEL is incorporated into the business model, it brings an improvement of $+0.6\%$ in AUC, which translates into a significant revenue gain; meanwhile, the embedding table becomes $2650$ times smaller.
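
The core idea that clustered entities jointly learn a shared embedding can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ClusteredEmbedding`, its methods, and the fixed round-robin assignment are all hypothetical, chosen only to show how a cluster-level table reduces memory and lets a cold entity benefit from updates made through other members of its cluster. CEL's automatic, dynamic, top-down clustering is not modeled here.

```python
import random


class ClusteredEmbedding:
    """Illustrative only: entities share one embedding row per cluster."""

    def __init__(self, num_entities, num_clusters, dim, seed=0):
        rng = random.Random(seed)
        # One row per cluster instead of one row per entity.
        self.table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)]
            for _ in range(num_clusters)
        ]
        # Assumption: a static round-robin assignment. CEL instead learns
        # and refines assignments during training.
        self.assignment = [e % num_clusters for e in range(num_entities)]

    def lookup(self, entity_id):
        # Every entity in a cluster resolves to the same shared row.
        return self.table[self.assignment[entity_id]]

    def sgd_step(self, entity_id, grad, lr=0.1):
        # A gradient from any member updates the shared row in place.
        row = self.lookup(entity_id)
        for k, g in enumerate(grad):
            row[k] -= lr * g

    def num_parameters(self):
        return len(self.table) * len(self.table[0])


emb = ClusteredEmbedding(num_entities=1000, num_clusters=10, dim=4)
# Entities 3 and 13 fall in the same cluster, so they share one vector.
shared = emb.lookup(3) is emb.lookup(13)
# A per-entity table would hold 1000 * 4 parameters; this one holds 10 * 4.
compression = (1000 * 4) // emb.num_parameters()
```

In this toy setting, `shared` is `True` and `compression` is 100. The reduction factor equals the ratio of entities to clusters, which is the same kind of ratio behind the table-size reduction reported in the abstract, though the figures here are made up for illustration.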


Updated: 2023-02-06
