Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System
arXiv - CS - Computation and Language, Pub Date: 2021-06-09, DOI: arxiv-2106.04835
Zichuan Lin, Jing Huang, Bowen Zhou, Xiaodong He, Tengyu Ma

Recent work (Takanobu et al., 2020) proposed system-wise evaluation of dialog systems and found that improvements to individual components (e.g., NLU, policy) reported in prior work do not necessarily benefit pipeline systems under system-wise evaluation. To improve system-wise performance, we propose new joint system-wise optimization techniques for the pipeline dialog system. First, we propose a new data augmentation approach that automates the labeling process for NLU training. Second, we propose a novel stochastic policy parameterization with a Poisson distribution that enables better exploration and offers a principled way to compute the policy gradient. Third, we propose a reward bonus to help the policy explore successful dialogs. Our approaches outperform the competitive pipeline systems from Takanobu et al. (2020) by large margins, 12% in success rate under automatic system-wise evaluation and 16% in success rate under human evaluation, on the standard multi-domain benchmark dataset MultiWOZ 2.1, and also outperform the recent state-of-the-art end-to-end trained model from DSTC9.
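
The abstract does not spell out the Poisson policy parameterization, so the following is only a generic illustration of why a Poisson-distributed action admits a simple score-function (REINFORCE) gradient. It assumes a single count-valued action k with rate exp(theta); the names `poisson_log_prob`, `score_wrt_logit`, `sample_poisson`, and `reinforce_gradient` are hypothetical and are not taken from the paper.

```python
import math
import random


def poisson_log_prob(k, rate):
    """Log-probability of the count k under Poisson(rate)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def score_wrt_logit(k, logit):
    """Derivative of log Poisson(k; exp(logit)) with respect to logit.

    With rate = exp(logit), the derivative simplifies to k - rate.
    """
    return k - math.exp(logit)


def sample_poisson(rate, rng):
    """Draw one Poisson sample (Knuth's method; fine for small rates)."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def reinforce_gradient(logit, reward_fn, n_samples, rng):
    """Monte Carlo estimate of d/d(logit) E[reward_fn(k)].

    Uses the score-function identity: grad E[R] = E[R * grad log p].
    reward_fn is a placeholder; the paper's reward is not modelled here.
    """
    rate = math.exp(logit)
    total = 0.0
    for _ in range(n_samples):
        k = sample_poisson(rate, rng)
        total += reward_fn(k) * score_wrt_logit(k, logit)
    return total / n_samples
```

As a sanity check under these assumptions, taking the reward to be k itself gives E[k] = exp(theta), whose derivative with respect to theta is again exp(theta), so the estimate from `reinforce_gradient` should approach the rate as the number of samples grows.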

Updated: 2021-06-10