Towards integrated dialogue policy learning for multiple domains and intents using Hierarchical Deep Reinforcement Learning
Expert Systems with Applications (IF 7.5), Pub Date: 2020-06-27, DOI: 10.1016/j.eswa.2020.113650
Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya

Creating an expert, intelligent Dialogue/Virtual Agent (VA) that can serve a user's complicated and intricate tasks (needs) spanning multiple domains and their various intents is challenging, because the agent must concurrently handle multiple subtasks in different domains. This paper presents an expert, unified and generic Deep Reinforcement Learning (DRL) framework that creates dialogue managers capable of managing task-oriented conversations spanning multiple domains along with their various intents, and provides the user with an expert system that is a one-stop shop for all queries. To address these multiple aspects, the dialogue exchange between the user and the VA is split into hierarchies so as to isolate and identify subtasks belonging to different domains. The notion of Hierarchical Reinforcement Learning (HRL), specifically options, is employed to learn optimal policies in these hierarchies, which operate at varying time steps to accomplish the user goal. The dialogue manager comprises a top-level domain meta-policy, intermediate-level intent meta-policies that select among multiple subtasks or options, and low-level controller policies that select primitive actions to complete the subtask assigned by the higher-level meta-policies across varying intents and domains. Sharing controller policies among overlapping subtasks enables the meta-policies to be generic. The proposed expert framework is demonstrated in the domains of “Air Travel” and “Restaurant”. Experiments comparing it against several strong baselines and a state-of-the-art model establish the efficiency of the learned policies and the need for such expert models capable of handling complex and composite tasks.
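
The three-level structure described in the abstract (domain meta-policy, intent meta-policies, shared controller) can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses deep networks and the options framework, whereas the sketch below swaps in tabular epsilon-greedy policies, assumes the user goal is observed perfectly, and invents the intent names, slot names, and reward values purely for illustration. Only the two domain names come from the abstract.

```python
import random

# Hypothetical intents and slots; only the two domain names come from the abstract.
INTENTS = {
    "air_travel": ["book_flight", "cancel_flight"],
    "restaurant": ["book_table", "cancel_table"],
}
SLOTS = {
    "book_flight": ("city", "date"),
    "book_table": ("city", "date"),        # overlaps with book_flight
    "cancel_flight": ("booking_id",),
    "cancel_table": ("booking_id",),       # overlaps with cancel_flight
}
ALL_SLOTS = ("city", "date", "booking_id")
GOALS = [(d, i) for d, intents in INTENTS.items() for i in intents]


class TabularPolicy:
    """Tabular epsilon-greedy stand-in for a learned deep policy (assumption)."""

    def __init__(self, choices, epsilon=0.2, lr=0.5):
        self.choices = list(choices)
        self.epsilon = epsilon
        self.lr = lr
        self.q = {}  # (state, choice) -> estimated reward

    def select(self, state, rng, greedy=False):
        if not greedy and rng.random() < self.epsilon:
            return rng.choice(self.choices)
        return max(self.choices, key=lambda c: self.q.get((state, c), 0.0))

    def update(self, state, choice, reward):
        old = self.q.get((state, choice), 0.0)
        self.q[(state, choice)] = old + self.lr * (reward - old)


class HierarchicalManager:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.domain_policy = TabularPolicy(list(INTENTS))            # top level
        self.intent_policies = {d: TabularPolicy(i)                  # middle level
                                for d, i in INTENTS.items()}
        self.controller = TabularPolicy(ALL_SLOTS)                   # shared low level

    def run_subtask(self, intent, greedy=False, max_turns=6):
        """Controller requests slots until the subtask's slots are all filled."""
        missing, asked = set(SLOTS[intent]), []
        for _ in range(max_turns):
            if not missing:
                break
            state = tuple(sorted(missing))  # same state for overlapping subtasks
            slot = self.controller.select(state, self.rng, greedy)
            if not greedy:
                # Intrinsic reward: +1 for a useful request, -1 for a wasted turn.
                self.controller.update(state, slot, 1.0 if slot in missing else -1.0)
            missing.discard(slot)
            asked.append(slot)
        return asked, not missing

    def run_dialogue(self, goal, greedy=False):
        goal_domain, goal_intent = goal
        domain = self.domain_policy.select(goal, self.rng, greedy)
        intent = self.intent_policies[domain].select(goal, self.rng, greedy)
        asked, done = self.run_subtask(intent, greedy)
        if not greedy:
            # Extrinsic rewards for the two meta-policies.
            self.domain_policy.update(goal, domain, 1.0 if domain == goal_domain else -1.0)
            self.intent_policies[domain].update(goal, intent, 1.0 if intent == goal_intent else -1.0)
        success = done and domain == goal_domain and intent == goal_intent
        return domain, intent, asked, success


def train(episodes=400, seed=0):
    manager = HierarchicalManager(seed)
    for _ in range(episodes):
        manager.run_dialogue(manager.rng.choice(GOALS))
    return manager
```

Calling `train()` and then `run_dialogue(goal, greedy=True)` for each entry in `GOALS` exercises the learned hierarchy. Because the controller's state depends only on the slots still missing, what it learns for `book_flight` carries over to `book_table`, which is one simple way to picture the controller sharing among overlapping subtasks that the abstract mentions.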




Updated: 2020-06-27