F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax
arXiv - CS - Computation and Language, Pub Date: 2020-09-20, DOI: arxiv-2009.09417
Byung-Ju Choi, Jimin Hong, David Keetae Park, Sang Wan Lee

Despite recent advances in neural text generation, encoding the rich diversity of human language remains elusive. We argue that sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. As a simple yet effective remedy, we propose two novel methods, F^2-Softmax and MefMax, which enable balanced training even under a skewed frequency distribution. MefMax assigns each token uniquely to a frequency class, grouping tokens with similar frequencies while equalizing the total frequency mass across classes. F^2-Softmax then decomposes the probability distribution of the target token into the product of two conditional probabilities: (i) the frequency class, and (ii) the token within that frequency class. Because each softmax is confined to a subset of the vocabulary, models learn more uniform probability distributions. Significant gains on seven relevant metrics suggest that our approach improves not only the diversity but also the quality of generated text.
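The two-step factorization described above lends itself to a short illustration. The following is a minimal PyTorch sketch, not the authors' implementation: mefmax_partition stands in for MefMax with a simple greedy split that balances cumulative frequency mass across classes, and FactorizedSoftmax computes log P(class | h) + log P(token | class, h), with the second softmax restricted to the target token's class. All names and the greedy heuristic are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def mefmax_partition(token_freqs, n_classes):
    """Greedily assign frequency-sorted tokens to classes so each class
    holds roughly equal total frequency mass (a rough stand-in for the
    paper's MefMax assignment)."""
    order = torch.argsort(token_freqs, descending=True)
    target_mass = (token_freqs.sum() / n_classes).item()
    class_of = torch.empty_like(order)
    cls, mass = 0, 0.0
    for tok in order.tolist():
        class_of[tok] = cls
        mass += token_freqs[tok].item()
        # Move to the next class once its share of the mass is filled.
        if mass >= target_mass * (cls + 1) and cls < n_classes - 1:
            cls += 1
    return class_of  # shape: (vocab,), maps token id -> frequency class

class FactorizedSoftmax(nn.Module):
    """P(token | h) = P(class(token) | h) * P(token | class(token), h)."""

    def __init__(self, hidden, vocab, class_of, n_classes):
        super().__init__()
        self.class_head = nn.Linear(hidden, n_classes)
        self.token_head = nn.Linear(hidden, vocab)
        self.register_buffer("class_of", class_of)

    def log_prob(self, h, target):
        # (i) log-probability of the target token's frequency class
        log_p_class = F.log_softmax(self.class_head(h), dim=-1)
        cls = self.class_of[target]                      # (batch,)
        # (ii) log-probability of the token, with the softmax restricted
        # to the tokens belonging to that frequency class
        logits = self.token_head(h)                      # (batch, vocab)
        mask = self.class_of.unsqueeze(0).eq(cls.unsqueeze(-1))
        logits = logits.masked_fill(~mask, float("-inf"))
        log_p_token = F.log_softmax(logits, dim=-1)
        return log_p_class.gather(-1, cls.unsqueeze(-1)).squeeze(-1) \
             + log_p_token.gather(-1, target.unsqueeze(-1)).squeeze(-1)

Training would simply minimize the negative of log_prob over the corpus; because each step-(ii) softmax runs over a frequency-balanced subset of the vocabulary, the per-class token distributions are closer to uniform, which is the balancing effect the abstract describes.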

Updated: 2020-10-06