Survey of Hallucination in Natural Language Generation, ACM Computing Surveys


Survey of Hallucination in Natural Language Generation
ACM Computing Surveys (IF 23.8), Pub Date: 2023-03-03, DOI: 10.1145/3571730
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Yejin Bang, Andrea Madotto, Pascale Fung

Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before.

In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions, and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, and machine translation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.

Updated: 2023-03-03