Disentangled Self-Attentive Neural Networks for Click-Through Rate Prediction
arXiv - CS - Information Retrieval Pub Date : 2021-01-11 , DOI: arxiv-2101.03654
Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, Liang Wang

Click-through rate (CTR) prediction, which aims to predict the probability that a user will click on an item, is an essential task for many online applications. Due to the data sparsity and high dimensionality inherent in CTR prediction, the key to effective prediction is modeling high-order feature interactions among feature fields. An efficient way to model high-order feature interactions explicitly is to stack multi-head self-attentive neural networks, which have achieved promising performance. However, one problem with the vanilla self-attentive network is that two terms, a whitened pairwise interaction term and a unary term, are coupled in the computation of the self-attention score: the pairwise term contributes to learning the importance score for each feature interaction, while the unary term models the impact of one feature on all other features. We identify two factors, coupled gradient computation and shared transformations, that impede the learning of both terms. To solve this problem, we present a novel Disentangled Self-Attentive neural Network (DSAN) model for CTR prediction, which disentangles the two terms to facilitate learning feature interactions. We conduct extensive experiments on two real-world benchmark datasets. The results show that DSAN not only retains computational efficiency but also achieves performance improvements over state-of-the-art baselines.
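The decomposition the abstract refers to can be illustrated numerically. The sketch below (a minimal NumPy illustration, not the paper's implementation; all function and variable names are ours) splits the raw attention score q_i^T k_j into a whitened pairwise term (q_i − μ_q)^T (k_j − μ_k) and a unary term μ_q^T k_j, and verifies that the remaining cross terms are constant across keys, so they cancel under a row-wise softmax:

```python
import numpy as np

def decompose_attention_scores(Q, K):
    """Split raw attention logits Q @ K.T into a whitened pairwise
    interaction term and a unary term (illustrative decomposition)."""
    mu_q = Q.mean(axis=0, keepdims=True)  # mean query, shape (1, d)
    mu_k = K.mean(axis=0, keepdims=True)  # mean key, shape (1, d)
    # whitened pairwise term: interaction between centered queries and keys
    pairwise = (Q - mu_q) @ (K - mu_k).T  # shape (n, n)
    # unary term: influence of each key on all queries via the mean query
    unary = mu_q @ K.T                    # shape (1, n), broadcast over rows
    return pairwise, unary

# toy setup: n feature fields with d-dimensional queries/keys
n, d = 4, 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))

pairwise, unary = decompose_attention_scores(Q, K)
coupled = Q @ K.T  # the vanilla (coupled) score

# the leftover terms q_i^T mu_k - mu_q^T mu_k do not depend on the key
# index j, so they are absorbed by a row-wise softmax
residual = Q @ K.mean(axis=0) - Q.mean(axis=0) @ K.mean(axis=0)  # shape (n,)
assert np.allclose(pairwise + unary + residual[:, None], coupled)
```

Because the residual is constant per query row, softmax over keys assigns the same attention weights whether it is included or not; disentangling therefore reduces to learning the pairwise and unary terms separately rather than through one shared projection.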

Updated: 2021-01-12