Neural Search: Learning Query and Product Representations in Fashion E-commerce
arXiv - CS - Information Retrieval. Pub Date: 2021-07-17. DOI: arxiv-2107.08291
Lakshya Kumar, Sagnik Sarkar

Typical e-commerce platforms carry millions of products in their catalogs. Users visit these platforms and enter search queries to retrieve the products they want, so surfacing relevant products at the top of the results is essential to an e-commerce platform's success. We approach this problem by learning low-dimensional representations for queries and product descriptions, leveraging user click-stream data as our main source of product-relevance signal. Starting from GRU-based architectures as our baseline model, we move to a more advanced Transformer-based architecture, which helps the model learn contextual representations of queries and products, serve better search results, and capture user intent efficiently. We perform experiments on pre-training the Transformer-based RoBERTa model on a fashion corpus and fine-tuning it with a triplet loss. Our experiments on the product ranking task show that the RoBERTa model improves Mean Reciprocal Rank (MRR) by 7.8%, Mean Average Precision (MAP) by 15.8%, and Normalized Discounted Cumulative Gain (NDCG) by 8.8%, outperforming our GRU-based baselines. On the product retrieval task, the RoBERTa model outperforms the other two models with improvements of 164.7% in Precision@50 and 145.3% in Recall@50. To highlight the importance of pre-training RoBERTa for the fashion domain, we qualitatively compare a RoBERTa model pre-trained on standard datasets with our RoBERTa model custom pre-trained on a fashion corpus, using a query token prediction task. Finally, we also show a qualitative comparison between GRU and RoBERTa results on the product retrieval task for several test queries.
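The fine-tuning step described above can be illustrated with a short sketch: a RoBERTa encoder maps each text to one embedding, and a triplet loss pulls a query toward a clicked product description while pushing it away from a non-clicked one. This is a minimal sketch assuming the Hugging Face transformers API; the mean-pooling choice and the example texts are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: triplet-loss fine-tuning of RoBERTa embeddings.
# Assumptions (not from the paper): mean pooling over the last hidden
# states, roberta-base weights, and made-up (query, clicked, non-clicked)
# example texts standing in for click-stream triples.
import torch
from transformers import RobertaModel, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def embed(texts):
    """Mean-pool the last hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)     # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)      # (B, H)

triplet_loss = torch.nn.TripletMarginLoss(margin=1.0)

# One toy triplet: a query, a clicked (relevant) product description,
# and a non-clicked (irrelevant) one, mirroring the click-stream signal.
anchor = embed(["red floral summer dress"])
positive = embed(["Women's red floral-print midi dress, sleeveless"])
negative = embed(["Men's leather office shoes, black"])

loss = triplet_loss(anchor, positive, negative)
loss.backward()  # gradients flow back into the RoBERTa encoder
print(f"triplet loss: {loss.item():.4f}")
```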

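For reference, the ranking and retrieval metrics quoted above can all be computed from per-query binary relevance lists, as in the sketch below; the relevance values at the bottom are toy data, not the paper's results.

```python
# Toy implementations of the reported metrics (MRR, MAP, NDCG,
# Precision@k, Recall@k), assuming binary relevance labels.
import math

def reciprocal_rank(rels):
    """1/rank of the first relevant result, 0 if none is relevant."""
    return next((1.0 / (i + 1) for i, r in enumerate(rels) if r), 0.0)

def average_precision(rels):
    """Precision averaged over the positions of the relevant results."""
    hits, score = 0, 0.0
    for i, r in enumerate(rels):
        if r:
            hits += 1
            score += hits / (i + 1)
    return score / hits if hits else 0.0

def ndcg(rels):
    """DCG normalized by the DCG of an ideally sorted list."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    idcg = sum(r / math.log2(i + 2)
               for i, r in enumerate(sorted(rels, reverse=True)))
    return dcg / idcg if idcg else 0.0

def precision_recall_at_k(rels, num_relevant, k=50):
    """Precision@k and Recall@k for one query."""
    hits = sum(rels[:k])
    return hits / k, hits / num_relevant

# Relevance of the top results for two toy queries (1 = relevant/clicked).
queries = [[0, 1, 0, 0, 1], [1, 0, 0, 0, 0]]
print("MRR:", sum(map(reciprocal_rank, queries)) / len(queries))    # 0.75
print("MAP:", sum(map(average_precision, queries)) / len(queries))  # 0.725
print("NDCG(q1):", round(ndcg(queries[0]), 3))
print("P@50, R@50:", precision_recall_at_k(queries[0], num_relevant=2))
```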
Updated: 2021-07-20