


ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
arXiv - CS - Computation and Language. Pub Date: 2021-06-10, DOI: arxiv-2106.05970
Wanrong Zhu, Xin Eric Wang, An Yan, Miguel Eckstein, William Yang Wang

Automatic evaluations for natural language generation (NLG) conventionally rely on token-level or embedding-level comparisons with the text references. This is different from human language processing, for which visual imaginations often improve comprehension. In this work, we propose ImaginE, an imagination-based automatic evaluation metric for natural language generation. With the help of CLIP and DALL-E, two cross-modal models pre-trained on large-scale image-text pairs, we automatically generate an image as the embodied imagination for the text snippet and compute the imagination similarity using contextual embeddings. Experiments spanning several text generation tasks demonstrate that adding imagination with our ImaginE displays great potential in introducing multi-modal information into NLG evaluation, and improves existing automatic metrics' correlations with human similarity judgments in many circumstances.
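The pipeline the abstract describes (render each text as an image, embed the images, compare the embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: `imagine` and `embed` are hypothetical callables standing in for a text-to-image model such as DALL-E and an image encoder such as CLIP, and cosine similarity is assumed here as the comparison function because the abstract does not name one.

```python
import math
from typing import Any, Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def imagination_similarity(
    reference: str,
    candidate: str,
    imagine: Callable[[str], Any],   # hypothetical: text -> generated image
    embed: Callable[[Any], Vector],  # hypothetical: image -> embedding vector
) -> float:
    """Compare two texts through the embeddings of their generated images."""
    ref_vec = embed(imagine(reference))
    cand_vec = embed(imagine(candidate))
    return cosine_similarity(ref_vec, cand_vec)
```

Passing the two models in as arguments keeps the sketch independent of any particular library; in practice they would wrap whichever image generator and encoder are available.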


Updated: 2021-06-11
