
Evaluation of Text Generation: A Survey
arXiv - CS - Computation and Language. Pub Date: 2020-06-26, DOI: arxiv-2006.14799
Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao

The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two case studies of automatic text summarization and long text generation, and conclude the paper by proposing future research directions.

Updated: 2020-06-29
