Towards Productionizing Subjective Search Systems,arXiv - CS - Information Retrieval


Towards Productionizing Subjective Search Systems
arXiv - CS - Information Retrieval. Pub Date: 2020-03-31, DOI: arxiv-2003.13968
Aaron Feng, Shuwei Chen, Yuliang Li, Hiroshi Matsuda, Hidekazu Tamaki, Wang-Chiew Tan

Existing e-commerce search engines typically support search only over objective attributes, such as price and location, leaving more desirable subjective attributes, such as a romantic vibe or work-life balance, unsearchable. We found that this is also the case for Recruit Group, which operates a wide range of online booking and search services, including jobs, travel, housing, bridal, dining, and beauty, each of which is among the biggest in Japan, if not internationally. We present our progress towards productionizing a recent subjective search prototype (OpineDB) developed by Megagon Labs for Recruit Group. Several components within OpineDB were enhanced to satisfy production demands, including the addition of a BERT language model pre-trained on massive hospitality-domain review corpora. We also found that the challenges of productionizing the system go beyond enhancing the components. In particular, an important requirement in production-quality systems is a proper way of measuring search quality, which is extremely tricky when the search results are subjective. This led to the creation of a high-quality benchmark dataset from scratch, involving over 600 queries collected through user interviews and more than 120,000 query-entity relevancy labels. We also found that the existing search algorithms do not meet the search quality standard required by production systems. Consequently, we enhanced the ranking model by fine-tuning several search algorithms and combining them under a learning-to-rank framework. The model achieves a 5%-10% overall precision improvement and 90+% precision on more than half of the benchmark testing queries, making these queries ready for A/B testing. While some enhancements can be immediately applied to other verticals, our experience reveals that benchmarking and fine-tuning ranking algorithms are specific to each domain and cannot be avoided.
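The abstract reports per-query precision against a benchmark of query-entity relevancy labels and treats queries above 90% precision as candidates for A/B testing. A minimal sketch of that kind of check is shown below. It is not the authors' evaluation code: the binary label format, the cutoff `k = 10`, the function names, and the example queries are all assumptions made for illustration.

```python
from typing import Dict, List


def precision_at_k(ranked_labels: List[int], k: int) -> float:
    """Fraction of the top-k ranked results labeled relevant (1 = relevant, 0 = not).

    Assumption: labels are binary and listed in ranked order. Missing results
    (fewer than k returned) count as non-relevant because we divide by k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


def queries_ready_for_ab_test(
    results: Dict[str, List[int]], k: int = 10, threshold: float = 0.9
) -> List[str]:
    """Return the queries whose precision@k meets the (assumed) readiness threshold."""
    return [
        query
        for query, labels in results.items()
        if precision_at_k(labels, k) >= threshold
    ]


# Hypothetical benchmark slice: query -> relevancy labels of the ranked results.
example = {
    "romantic vibe": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "quiet room": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "good for work": [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
}

ready = queries_ready_for_ab_test(example)
```

With these made-up labels, the first two queries reach the 0.9 threshold and the third (precision 0.7) does not. The share of queries in `ready` corresponds to the "more than half of the benchmark testing queries" style of figure quoted in the abstract.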


Updated: 2020-04-01
