Leaf-FM: A Learnable Feature Generation Factorization Machine for Click-Through Rate Prediction
arXiv - CS - Information Retrieval, Pub Date: 2021-07-26, DOI: arxiv-2107.12024
Qingyun She, Zhiqiang Wang, Junlin Zhang

Click-through rate (CTR) prediction plays an important role in personalized advertising and recommender systems. Although many models, such as FM, FFM and DeepFM, have been proposed in recent years, feature engineering remains a very important way to improve model performance in many applications, because using raw features rarely leads to optimal results. For example, continuous features are often transformed into power forms that are added as new features, so that the model can more easily form non-linear functions of the original feature. However, this kind of feature engineering relies heavily on human experience and is both time-consuming and labor-intensive. On the other hand, a concise CTR model with both fast online serving speed and good model performance is critical for many real-life applications. In this paper, we propose the Leaf-FM model, based on FM, which generates new features from the original feature embeddings by learning the transformation functions automatically. We also design three concrete Leaf-FM models according to different strategies for combining the original and the generated features. Extensive experiments are conducted on three real-world datasets, and the results show that Leaf-FM outperforms standard FMs by a large margin. Compared with FFMs, Leaf-FM achieves significantly better performance with far fewer parameters. On the Avazu and Malware datasets, the add version of Leaf-FM achieves performance comparable to some deep learning based models such as DNN and AutoInt. As an improved FM model, Leaf-FM has the same computational complexity as FM in the online serving phase, which means Leaf-FM is applicable in many industrial settings because of its better performance and high computational efficiency.
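The idea of generating a new embedding from each original one and then combining the two before the FM interaction can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the transformation functions, so the elementwise affine map in `generate`, the `combine_add` helper, and all numeric values below are hypothetical stand-ins, and every feature value is assumed to be 1 (one-hot categorical input). The sketch also shows the standard FM identity that computes the sum of all pairwise inner products in linear time. If the combined embeddings are precomputed after training, serving would use the same FM computation, which is one plausible reading of the complexity claim above.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fm_pairwise_naive(embeddings):
    """Sum of <v_i, v_j> over all pairs i < j, in O(n^2 * k)."""
    return sum(dot(u, v) for u, v in combinations(embeddings, 2))


def fm_pairwise_fast(embeddings):
    """Same quantity in O(n * k): 0.5 * (||sum v||^2 - sum ||v||^2)."""
    k = len(embeddings[0])
    total = [sum(v[d] for v in embeddings) for d in range(k)]
    return 0.5 * (dot(total, total) - sum(dot(v, v) for v in embeddings))


def generate(v, scale, shift):
    """Hypothetical learned transformation: an elementwise affine map."""
    return [s * x + b for s, x, b in zip(scale, v, shift)]


def combine_add(v, scale, shift):
    """'Add'-style combination (assumed): original plus generated embedding."""
    g = generate(v, scale, shift)
    return [x + y for x, y in zip(v, g)]


# Toy embeddings for three active features (k = 2); values are made up.
embeddings = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
scale, shift = [0.5, 0.5], [0.0, 1.0]  # stand-ins for learned parameters

# Combine once, then run the ordinary FM second-order term on the result.
combined = [combine_add(v, scale, shift) for v in embeddings]
score = fm_pairwise_fast(combined)
```

In this sketch only the embeddings fed to the interaction term change; the interaction computation itself is the plain FM one.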

Updated: 2021-07-27