Reading the written language environment: Learning orthographic structure from statistical regularities
Journal of Memory and Language (IF 4.3), Pub Date: 2020-10-01, DOI: 10.1016/j.jml.2020.104148
Teresa Marie Schubert, Trevor Cohen, Simon Fischer-Baum

Abstract: Statistical regularities in the environment impact cognition across domains. In semantics, distributional approaches posit that similarity between words can be derived from regularities in the contexts in which they appear. Here, we study how regularities in written text impact readers’ knowledge about orthography: Can similarity between characters be learned from the written environment? Adapting methods from distributional semantics, we model the contextual similarity among alphanumeric characters in a large text corpus. We find modest correlations between model-derived similarities and similarities derived from a behavioral experiment. Beyond this result, similarity derived from neural embedding models captures key aspects of orthographic knowledge, such as case, letter identity, and consonant–vowel status. We conclude that the text environment contains regularities that are relevant to readers and that statistical learning is a promising way for this information to be acquired. More broadly, our results imply that statistical regularities are relevant not only at the level of word semantics but also at the level of individual written characters.
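To make the idea of "contextual similarity among characters" concrete, the following is a minimal sketch of a count-based distributional approach, not the neural embedding models the authors report. It assumes a fixed symmetric context window and cosine similarity; the function names `context_vectors` and `cosine`, the window size, and the toy corpus are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
import math


def context_vectors(text, window=1):
    """For each non-space character, count the characters that occur
    within `window` positions of it (illustrative assumption: a fixed
    symmetric window; spaces count as context but get no vector)."""
    vectors = defaultdict(Counter)
    for i, ch in enumerate(text):
        if ch.isspace():
            continue
        lo = max(0, i - window)
        hi = min(len(text), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[ch][text[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, purely illustrative; the paper uses a large text corpus.
corpus = "the cat sat on the mat and the rat ran"
vecs = context_vectors(corpus, window=1)
print(cosine(vecs["c"], vecs["m"]))  # two consonants in similar contexts
print(cosine(vecs["a"], vecs["t"]))  # a vowel versus a consonant
```

Under this sketch, characters that occur in similar neighborhoods receive similar count vectors, which is the intuition the abstract borrows from distributional semantics. A pairwise similarity matrix built this way could then be compared against behavioral similarity judgments.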

Updated: 2020-10-01