Controlling the Risk of Conversational Search via Reinforcement Learning
arXiv - CS - Information Retrieval. Pub Date: 2021-01-15, DOI: arxiv-2101.06327
Zhenduo Wang, Qingyao Ai

Users often formulate their search queries in immature language, without well-developed keywords or complete structures. Such queries fail to express their true information needs and raise ambiguity, as fragmentary language often yields multiple interpretations and aspects. This makes the query hard for search engines to process and understand, and eventually leads to unsatisfactory retrieval results. An alternative to answering an ambiguous query directly is to proactively ask the user clarifying questions. Recent years have seen many works and shared tasks from both the NLP and IR communities on identifying the need to ask clarifying questions and on methods for generating them. A fact often neglected by these works is that even when the need for a clarifying question is correctly recognized, the questions these systems generate can still be off-topic and dissatisfying to users, and may simply cause users to leave the conversation. In this work, we propose a risk-aware conversational search agent model that balances the risk of answering the user's query against the risk of asking clarifying questions. The agent is fully aware that asking clarifying questions can potentially collect more information from the user, but it compares all of its choices and evaluates the risks before deciding between answering and asking. To demonstrate that our system can retrieve better answers, we conduct experiments on the MSDialog dataset, which contains real-world customer service conversations from the Microsoft products community. We also propose a reinforcement learning strategy that allows us to train our model directly on the original dataset, saving us from any further data annotation effort. Our experimental results show that our risk-aware conversational search agent significantly outperforms strong non-risk-aware baselines.
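The abstract does not specify the model architecture or training algorithm, but the answer-vs-ask decision it describes can be illustrated with a toy reinforcement learning sketch. Everything below is a hypothetical illustration, not the paper's method: the discrete "ambiguity level" state, the RiskAwareAgent class, the simulate_turn reward model, and all numeric values are assumptions chosen only to show the risk trade-off.

```python
import random

# Two hypothetical actions: ANSWER the query now, or ASK a clarifying question.
ANSWER, ASK = 0, 1

class RiskAwareAgent:
    """Minimal tabular Q-learning sketch of an answer-vs-ask policy.
    The state is a toy 'ambiguity level' (0 = clear query), an assumption
    made purely for illustration; the paper learns from MSDialog instead."""

    def __init__(self, n_states=5, lr=0.1, gamma=0.9, epsilon=0.1):
        self.q = [[0.0, 0.0] for _ in range(n_states)]
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy: compare the estimated value (risk) of each choice.
        if random.random() < self.epsilon:
            return random.choice([ANSWER, ASK])
        return max((ANSWER, ASK), key=lambda a: self.q[state][a])

    def update(self, state, action, reward, next_state, done):
        # Standard one-step Q-learning backup.
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])

def simulate_turn(state, action):
    """Toy reward model (an assumption, not from the paper): answering a
    clear query is good, answering an ambiguous one is bad; asking reduces
    ambiguity but risks an off-topic question that makes the user leave."""
    if action == ANSWER:
        reward = 1.0 if state == 0 else -1.0  # retrieval quality drops with ambiguity
        return reward, state, True            # the conversation ends after an answer
    if random.random() < 0.2:                 # off-topic question: user leaves
        return -2.0, state, True
    return -0.1, max(0, state - 1), False     # small per-question cost; ambiguity shrinks

agent = RiskAwareAgent()
for episode in range(5000):
    state, done = random.randrange(5), False
    while not done:
        action = agent.act(state)
        reward, next_state, done = simulate_turn(state, action)
        agent.update(state, action, reward, next_state, done)
        state = next_state

# Greedy action per ambiguity level: 0 = answer, 1 = ask.
print([max((ANSWER, ASK), key=lambda a: agent.q[s][a]) for s in range(5)])
```

Under these assumed rewards, the trained policy tends to answer immediately when ambiguity is low and to ask first when it is high, mirroring the risk trade-off between a poor direct answer and an off-topic clarifying question that the abstract describes.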

Updated: 2021-01-19