Building referring expression corpora with and without feedback
Language Resources and Evaluation (IF 2.7), Pub Date: 2020-07-08, DOI: 10.1007/s10579-020-09497-2
Danillo da Silva Rocha, Ivandré Paraboni

The design of data collection experiments involving human participants is a common task in Referring Expression Generation (REG) and related fields. Many (or most) REG data collection tasks are implemented by making use of a human–computer (e.g., web-based) communicative setting, in which participants do not have any particular addressee in mind and do not receive any feedback regarding the appropriateness (e.g., uniqueness) of the descriptions that they produce. Others, at a possibly higher cost, make use of participant pairs engaged in some form of dialogue in which hearers may provide feedback allowing speakers to rephrase ambiguous or otherwise ill-formed descriptions. Leaving the issue of cost aside, however, it remains unclear whether the two methods elicit similar referring expressions for the purpose of REG research. To shed light on this issue, this paper presents a REG corpus built under three experimental conditions: a standard human–computer (or web-based) setting in which no feedback is available to the speaker, and two settings in which feedback regarding the appropriateness of the description may be provided either by an automated parsing tool or by a second participant at the receiving end of the communication. The corpus contains fully annotated descriptions in two domains—simple geometric objects and realistic human face images—and it is provided as a resource for the training and testing of REG algorithms in these communicative settings.




Updated: 2020-07-24