


Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System
arXiv - CS - Computation and Language. Pub Date: 2021-06-09. DOI: arxiv-2106.04835
Zichuan Lin, Jing Huang, Bowen Zhou, Xiaodong He, Tengyu Ma

Recent work (Takanobu et al., 2020) proposed system-wise evaluation of dialog systems and found that improvements to individual components (e.g., NLU, policy) in prior work do not necessarily benefit pipeline systems under system-wise evaluation. To improve system-wise performance, in this paper we propose new joint system-wise optimization techniques for the pipeline dialog system. First, we propose a new data augmentation approach that automates the labeling process for NLU training. Second, we propose a novel stochastic policy parameterization with a Poisson distribution that enables better exploration and offers a principled way to compute the policy gradient. Third, we propose a reward bonus to help the policy explore successful dialogs. Our approaches outperform the competitive pipeline systems from Takanobu et al. (2020) by large margins, 12% in success rate in automatic system-wise evaluation and 16% in success rate in human evaluation, on the standard multi-domain benchmark dataset MultiWOZ 2.1, and also outperform the recent state-of-the-art end-to-end trained model from DSTC9.
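The abstract does not spell out the Poisson parameterization, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single hypothetical scalar parameter `theta` (the log-rate) for a Poisson-distributed count, such as how many dialog acts to emit in a turn, and shows why the log-probability has a simple closed-form gradient that a REINFORCE-style update can use. All function names, the baseline, and the learning rate are assumptions made for the example.

```python
import math
import random


def poisson_log_pmf(k: int, rate: float) -> float:
    """log P(K = k) for K ~ Poisson(rate)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def grad_log_pmf_wrt_log_rate(k: int, rate: float) -> float:
    """d/d(theta) of log P(K = k) when rate = exp(theta); equals k - rate."""
    return k - rate


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson sample with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def reinforce_step(theta, reward_fn, rng, lr=0.05, batch=64):
    """One score-function (REINFORCE) update on the hypothetical log-rate theta."""
    rate = math.exp(theta)
    samples = [sample_poisson(rate, rng) for _ in range(batch)]
    rewards = [reward_fn(k) for k in samples]
    baseline = sum(rewards) / batch  # mean-reward baseline to reduce variance
    grad = sum(
        (r - baseline) * grad_log_pmf_wrt_log_rate(k, rate)
        for k, r in zip(samples, rewards)
    ) / batch
    return theta + lr * grad


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.0
    # Toy reward (an assumption): counts closer to 3 are better.
    for _ in range(200):
        theta = reinforce_step(theta, lambda k: -abs(k - 3), rng)
    print("learned rate:", math.exp(theta))
```

Parameterizing the rate as `exp(theta)` keeps it positive without constraints, and the gradient `k - rate` falls out of the log-pmf directly. The toy reward in the usage block merely stands in for a dialog success signal.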


Updated: 2021-06-10
