A Better Variant of Self-Critical Sequence Training
arXiv - CS - Computation and Language. Pub Date: 2020-03-22, DOI: arxiv-2003.09971
Ruotian Luo

In this work, we present a simple yet better variant of Self-Critical Sequence Training. We make a simple change to the choice of baseline function in the REINFORCE algorithm. Compared to the greedy-decoding baseline, the new baseline brings better performance at no extra cost.
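The sketch below (not the author's released code) illustrates the contrast the abstract draws: standard SCST uses the reward of the greedy-decoded caption as the REINFORCE baseline, while the variant is assumed here to use the mean reward of the other sampled captions for the same input (a leave-one-out mean); function names and the toy rewards are illustrative.

```python
# Minimal sketch, assuming the variant's baseline is the leave-one-out mean
# reward of the other samples drawn for the same input.
import numpy as np

def scst_advantages(sample_rewards, greedy_reward):
    """Standard SCST: advantage = r(sample) - r(greedy decode)."""
    return sample_rewards - greedy_reward

def variant_advantages(sample_rewards):
    """Variant: baseline for each sample is the mean reward of the
    *other* samples for the same input (leave-one-out mean)."""
    k = len(sample_rewards)
    baseline = (sample_rewards.sum() - sample_rewards) / (k - 1)
    return sample_rewards - baseline

# Toy rewards (e.g. CIDEr scores) for 5 sampled captions of one image.
rewards = np.array([0.8, 1.1, 0.9, 1.3, 0.7])
print(scst_advantages(rewards, greedy_reward=1.0))
print(variant_advantages(rewards))
```

In either case the advantage multiplies the negative log-probability of the sampled sequence to form the REINFORCE loss; since the samples are drawn anyway, the leave-one-out baseline adds no extra decoding pass.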

Updated: 2020-05-12