Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System
arXiv - CS - Computation and Language
Pub Date: 2021-06-09
DOI: arxiv-2106.04835
Authors: Zichuan Lin, Jing Huang, Bowen Zhou, Xiaodong He, Tengyu Ma
Recent work (Takanobu et al., 2020) proposed system-wise evaluation of
dialog systems and found that improvements to individual components (e.g., NLU,
policy) in prior work do not necessarily benefit pipeline systems under
system-wise evaluation. To improve system-wise performance, in this paper
we propose new joint system-wise optimization techniques for the pipeline
dialog system. First, we propose a new data augmentation approach that
automates the labeling process for NLU training. Second, we propose a novel
stochastic policy parameterization with a Poisson distribution that enables
better exploration and offers a principled way to compute the policy gradient.
Third, we propose a reward bonus to help the policy explore successful dialogs.
Our approaches outperform the competitive pipeline systems from Takanobu et al.
(2020) by large margins: 12% in success rate in automatic system-wise evaluation
and 16% in success rate in human evaluation on the standard multi-domain
benchmark dataset MultiWOZ 2.1. They also outperform the recent state-of-the-art
end-to-end trained model from DSTC9.
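
The abstract does not say how the Poisson distribution enters the policy, so the following is only a generic illustration of why a Poisson-parameterized stochastic policy admits a simple policy gradient; it is not the authors' method. It assumes a single scalar action k (a count) drawn from Poisson(rate), with rate = exp(log_rate), for which the score function is d/d(log_rate) log p(k) = k - rate. The toy reward, the function names, and the hyperparameters are all hypothetical.

```python
import math
import random


def poisson_log_prob(k: int, rate: float) -> float:
    """log p(k | rate) for a Poisson distribution."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def poisson_score(k: int, log_rate: float) -> float:
    """Derivative of log p(k | exp(log_rate)) with respect to log_rate."""
    return k - math.exp(log_rate)


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson sample with Knuth's method (fine for small rates)."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def toy_reward(k: int) -> float:
    """Hypothetical reward: highest when the sampled count is 3."""
    return -float((k - 3) ** 2)


def reinforce_step(log_rate, rng, batch=256, lr=0.02):
    """One score-function (REINFORCE) update with a batch-mean baseline."""
    rate = math.exp(log_rate)
    samples = [sample_poisson(rate, rng) for _ in range(batch)]
    rewards = [toy_reward(k) for k in samples]
    baseline = sum(rewards) / batch  # reduces variance of the estimate
    grad = sum(
        (r - baseline) * poisson_score(k, log_rate)
        for k, r in zip(samples, rewards)
    ) / batch
    return log_rate + lr * grad  # gradient ascent on expected reward


def train(steps=300, seed=0):
    """Run gradient ascent from rate 1.0 and return the final rate."""
    rng = random.Random(seed)
    log_rate = 0.0
    for _ in range(steps):
        log_rate = reinforce_step(log_rate, rng)
    return math.exp(log_rate)
```

Calling train() moves the rate away from its initial value of 1.0 toward the value that maximizes the toy expected reward. Parameterizing by log_rate keeps the rate positive without any clipping, which is one reason this form is convenient for gradient updates.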
Updated: 2021-06-10