The Use of Naturalistic Reading Corpora for the Study of Pronoun and Coreference Resolution
Language and Linguistics Compass (IF 2.8), Pub Date: 2020-09-22, DOI: 10.1111/lnc3.12395
Olga Seminck

Naturalistic reading corpora are collections of texts that were not designed for use in specific linguistic studies and that were read by participants whose eye movements or reading times were measured. These resources are used to study the cognitive processing of linguistic phenomena naturally present in texts, and they encourage the development of robust models of cognitive linguistic processing. These properties make natural text corpora interesting for the study of pronoun and coreference resolution. The psycholinguistic literature has identified many linguistic factors that influence pronoun and coreference resolution, but much remains unknown about how these factors interact in naturalistic data. In addition, items used in psycholinguistic studies are short; naturalistic reading corpora are therefore a resource for studying pronoun and coreference resolution in realistic discourse. In this survey, we discuss the models for pronoun and coreference resolution that have been developed so far. We explain the methodological challenges related to the use of naturalistic data and speculate on how such data can be used to evaluate theories of pronoun and coreference resolution and so lead to the development of broad-coverage models in which various linguistic levels (syntax, semantics and discourse) are integrated.


Updated: 2020-09-22
