A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation
arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-15, DOI: arxiv-2009.07243
Moin Nadeem, Tianxing He, Kyunghyun Cho, James Glass

This work studies the widely adopted ancestral sampling algorithms for auto-regressive language models, which have received little systematic study in the literature. We use the quality-diversity (Q-D) trade-off to investigate three popular sampling algorithms (top-k, nucleus, and tempered sampling), focusing on the task of open-ended language generation. We first show that the existing sampling algorithms perform similarly. After carefully inspecting the transformations defined by the different sampling algorithms, we identify three key properties shared among them: entropy reduction, order preservation, and slope preservation. To validate the importance of these properties, we design two sets of new sampling algorithms: one in which each algorithm satisfies all three properties, and one in which each algorithm violates at least one of them. We compare their performance with that of the existing sampling algorithms and find that violating the identified properties can lead to drastic performance degradation, as measured by the Q-D trade-off. In contrast, the set of sampling algorithms that satisfies these properties performs on par with the existing sampling algorithms. Our data and code are available at https://github.com/moinnadeem/characterizing-sampling-algorithms
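As a rough illustration of the transformations the abstract refers to, the sketch below applies top-k, nucleus, and tempered sampling to a toy next-token distribution and checks two of the three named properties (entropy reduction and order preservation). It is not the authors' code: the function names, the toy distribution, and the hyperparameter values are assumptions made for this example, and slope preservation is left out because the abstract does not define it.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def top_k(probs, k):
    """Keep the k most probable tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    return normalize([q if i in keep else 0.0 for i, q in enumerate(probs)])

def nucleus(probs, p):
    """Keep the smallest set of top tokens whose total mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    return normalize([q if i in keep else 0.0 for i, q in enumerate(probs)])

def tempered(probs, t):
    """Raise each probability to the power 1/t and renormalize (t < 1 sharpens)."""
    return normalize([q ** (1.0 / t) for q in probs])

def entropy(probs):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def preserves_order(before, after):
    """True if before[i] >= before[j] always implies after[i] >= after[j]."""
    n = len(before)
    return all(after[i] >= after[j]
               for i in range(n) for j in range(n)
               if before[i] >= before[j])

if __name__ == "__main__":
    # Hypothetical next-token distribution over a five-token vocabulary.
    probs = [0.5, 0.25, 0.15, 0.07, 0.03]
    transforms = {
        "top-k (k=2)": top_k(probs, 2),
        "nucleus (p=0.8)": nucleus(probs, 0.8),
        "tempered (t=0.7)": tempered(probs, 0.7),
    }
    print(f"original entropy: {entropy(probs):.3f}")
    for name, out in transforms.items():
        print(f"{name}: entropy={entropy(out):.3f}, "
              f"order preserved={preserves_order(probs, out)}")
```

On this toy distribution all three transformations lower the entropy and keep the ranking of tokens intact, which is consistent with the abstract's claim that the existing algorithms share these properties. The check says nothing about generation quality or diversity, which the paper measures separately through the Q-D trade-off.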

Updated: 2020-09-16