Mining Latent Structures for Multimedia Recommendation



arXiv - CS - Multimedia, Pub Date: 2021-04-19, DOI: arxiv-2104.09036
Jinghao Zhang, Yanqiao Zhu, Qiang Liu, Shu Wu, Shuhui Wang, Liang Wang

Multimedia content predominates in the modern Web era. How users interact with multimodal items remains a continuing concern in the rapid development of recommender systems. Most previous work focuses on modeling user-item interactions, with multimodal features included as side information. However, this scheme is not well suited to multimedia recommendation. Specifically, only collaborative item-item relationships are modeled, and only implicitly, through high-order item-user-item relations. Considering that items are associated with rich content in multiple modalities, we argue that the latent item-item structures underlying this multimodal content could be beneficial for learning better item representations and further boosting recommendation performance. To this end, we propose a LATent sTructure mining method for multImodal reCommEndation, which we term LATTICE for brevity. In the proposed LATTICE model, we devise a novel modality-aware structure learning layer, which learns item-item structures for each modality and aggregates multiple modalities to obtain latent item graphs. Based on the learned latent graphs, we perform graph convolutions to explicitly inject high-order item affinities into item representations. These enriched item representations can then be plugged into existing collaborative filtering methods to make more accurate recommendations. Extensive experiments on three real-world datasets demonstrate the superiority of our method over state-of-the-art multimedia recommendation methods and validate the efficacy of mining latent item-item relationships from multimodal features.
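The pipeline described in the abstract (per-modality item-item structures, aggregation into a latent item graph, then graph convolution over item representations) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how item-item structures are built or aggregated, so the cosine-similarity kNN graph, the degree normalization, the softmax-weighted sum with fixed weights, and the parameter-free propagation step are all assumptions made for illustration, as are the function names `knn_graph`, `normalize`, `latent_item_graph`, and `propagate`.

```python
import numpy as np


def knn_graph(features, k):
    """Build an item-item graph from one modality's features.

    Assumption: cosine similarity with top-k sparsification per row.
    The abstract does not specify the similarity or sparsification.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = np.maximum(unit @ unit.T, 0.0)  # drop negative similarities
    n = sim.shape[0]
    # Column indices of the k largest similarities in each row.
    top = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    adj = np.zeros_like(sim)
    adj[rows, top] = sim[rows, top]
    return adj


def normalize(adj):
    """Scale edges by D^{-1/2} A D^{-1/2}, using row sums as degrees."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.clip(deg, 1e-12, None))
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def latent_item_graph(modal_features, weights, k):
    """Aggregate per-modality graphs into one latent item graph.

    Assumption: a softmax-weighted sum. The weights are fixed inputs
    here; a trained model would learn them.
    """
    w = np.exp(weights - np.max(weights))
    w = w / w.sum()
    graphs = [normalize(knn_graph(f, k)) for f in modal_features]
    return sum(wi * g for wi, g in zip(w, graphs))


def propagate(adj, item_emb, num_layers):
    """Parameter-free graph convolution: repeated neighbor aggregation."""
    h = item_emb
    for _ in range(num_layers):
        h = adj @ h
    return h


# Toy usage: 6 items with made-up visual and textual features.
rng = np.random.default_rng(0)
visual = rng.normal(size=(6, 8))
textual = rng.normal(size=(6, 5))
item_emb = rng.normal(size=(6, 4))

graph = latent_item_graph([visual, textual], np.zeros(2), k=3)
enriched = propagate(graph, item_emb, num_layers=2)
```

In this sketch, `enriched` plays the role of the enriched item representations that the abstract says can be plugged into an existing collaborative filtering method. How they are combined with that method's own item embeddings is left open here.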


Updated: 2021-04-20
