A Comparative Study of Outfit Recommendation Methods with a Focus on Attention-based Fusion, Information Processing & Management

A Comparative Study of Outfit Recommendation Methods with a Focus on Attention-based Fusion
Information Processing & Management (IF 8.6), Pub Date: 2020-06-24, DOI: 10.1016/j.ipm.2020.102316
Katrien Laenen, Marie-Francine Moens

In recent years, deep learning-based recommender systems have received increasing attention, as deep neural networks can detect important product features in images and text descriptions and capture them in semantic vector representations of items. This is especially relevant for outfit recommendation, since a variety of fashion product features play a role in creating outfits. This work is a comparative study of fusion methods for outfit recommendation that combine relevant product features extracted from visual and textual data in semantic, multimodal item representations. We compare traditional fusion methods with attention-based fusion methods, which are designed to focus on the fine-grained product features of items. We evaluate the fusion methods on four benchmark datasets for outfit recommendation and provide insights into the importance of the multimodality and granularity of the fashion item representations. We find that the visual and textual item data not only share product features but also contain complementary product features for the outfit recommendation task, confirming the need to effectively combine them into multimodal item representations. Furthermore, we show that the average performance of attention-based fusion methods surpasses the average performance of traditional fusion methods on three out of the four benchmark datasets, demonstrating the ability of attention to learn relevant correlations among fine-grained fashion attributes.
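To make the contrast between traditional and attention-based fusion concrete, the following is a minimal sketch and not the authors' architecture. It assumes that each item has one text vector and several fine-grained visual region vectors, that "traditional" fusion means plain concatenation, and that attention weights come from a softmax over dot-product similarities. The function names (`concat_fusion`, `attention_fusion`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def concat_fusion(visual, textual):
    """Traditional fusion (assumed here to be plain concatenation)."""
    return list(visual) + list(textual)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fusion(regions, textual):
    """Attention-based fusion (illustrative assumption, not the paper's model).

    Each fine-grained visual region vector is weighted by its dot-product
    similarity to the text vector; the weighted sum is then concatenated
    with the text vector to form the multimodal item representation.
    """
    weights = softmax([dot(r, textual) for r in regions])
    dim = len(regions[0])
    attended = [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]
    return attended + list(textual), weights


# Toy data: two visual regions and one text vector, all hypothetical.
regions = [[1.0, 0.0], [0.0, 1.0]]
textual = [2.0, 0.0]

fused_concat = concat_fusion(regions[0], textual)
fused_attn, weights = attention_fusion(regions, textual)
```

In this toy setup the first region is more similar to the text vector, so it receives the larger attention weight and dominates the attended visual part of the fused representation. Concatenation, by contrast, treats its inputs uniformly.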

Updated: 2020-06-25
