Effects of task and corpus-derived association scores on the online processing of collocations
Corpus Linguistics and Linguistic Theory (IF 2.143), Pub Date: 2019-05-10, DOI: 10.1515/cllt-2018-0030
Kyla McConnell, Alice Blumenthal-Dramé

Abstract In the following self-paced reading study, we assess the cognitive realism of six widely used corpus-derived measures of association strength between words (collocated modifier–noun combinations like vast majority): MI, MI3, Dice coefficient, T-score, Z-score, and log-likelihood. The ability of these collocation metrics to predict reading times is tested against predictors of lexical processing cost that are widely established in the psycholinguistic and usage-based literature, respectively: forward/backward transition probability and bigram frequency. In addition, the experiment includes the treatment variable of task: it is split into two blocks which only differ in the format of interleaved comprehension questions (multiple choice vs. typed free response). Results show that the traditional corpus-linguistic metrics are outperformed by both backward transition probability and bigram frequency. Moreover, the multiple-choice condition elicits faster overall reading times than the typed condition, and the two winning metrics show stronger facilitation on the critical word (i.e. the noun in the bigrams) in the multiple-choice condition. In the typed condition, we find an effect that is weaker and, in the case of bigram frequency, longer lasting, continuing into the first spillover word. We argue that insufficient attention to task effects might have obscured the cognitive correlates of association scores in earlier research.
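For readers unfamiliar with the measures named in the abstract, the following is a minimal Python sketch of how they are commonly computed from four corpus counts. It uses the textbook definitions and is not taken from the paper, whose exact formulas, corpus, and smoothing may differ. The function name `association_scores`, its parameter names, and the counts in the usage line are hypothetical. Forward transition probability is assumed here to mean P(noun | modifier) and backward transition probability P(modifier | noun).

```python
import math


def association_scores(f_xy, f_x, f_y, n):
    """Common association scores for a bigram (x, y).

    f_xy: bigram frequency, f_x: frequency of the first word,
    f_y: frequency of the second word, n: corpus size.
    Textbook definitions (assumption); the paper may use variants.
    """
    expected = f_x * f_y / n  # expected bigram frequency under independence

    # Observed 2x2 contingency table and its marginals, for log-likelihood.
    observed = [
        [f_xy, f_x - f_xy],
        [f_y - f_xy, n - f_x - f_y + f_xy],
    ]
    rows = [f_x, n - f_x]
    cols = [f_y, n - f_y]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = rows[i] * cols[j] / n
            if o > 0:  # a cell with zero observations contributes nothing
                g2 += o * math.log(o / e)
    g2 *= 2

    return {
        "bigram_frequency": f_xy,
        "MI": math.log2(f_xy / expected),
        "MI3": math.log2(f_xy ** 3 / expected),
        "Dice": 2 * f_xy / (f_x + f_y),
        "T-score": (f_xy - expected) / math.sqrt(f_xy),
        "Z-score": (f_xy - expected) / math.sqrt(expected),
        "log-likelihood": g2,
        "forward_TP": f_xy / f_x,   # P(second word | first word)
        "backward_TP": f_xy / f_y,  # P(first word | second word)
    }


# Hypothetical counts, for illustration only (not data from the study).
scores = association_scores(f_xy=8, f_x=64, f_y=128, n=8192)
```

All of the scores derive from the same four counts, so they are often strongly correlated. The two transition probabilities differ from the rest in that each conditions on only one of the two words.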

Updated: 2019-05-10