

Token-based typology and word order entropy: A study based on Universal Dependencies
Linguistic Typology (IF 3.565), Pub Date: 2019-11-26, DOI: 10.1515/lingty-2019-0025
Natalia Levshina

Abstract: The present paper discusses the benefits and challenges of token-based typology, which takes into account the frequencies of words and constructions in language use. This approach makes it possible to introduce new criteria for language classification, which would be difficult or impossible to achieve with the traditional, type-based approach. This point is illustrated by several quantitative studies of word order variation, which can be measured as entropy at different levels of granularity. I argue that this variation can be explained by general functional mechanisms and pressures, which manifest themselves in language use, such as optimization of processing (including avoidance of ambiguity) and grammaticalization of predictable units occurring in chunks. The case studies are based on multilingual corpora, which have been parsed using the Universal Dependencies annotation scheme.
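To make the idea of measuring word order variation as entropy concrete, here is a minimal sketch in Python. It assumes plain Shannon entropy over observed order labels. The abstract does not state the paper's exact formula, so the function name `order_entropy`, the labels `head-initial` and `head-final`, and the counts below are all hypothetical illustrations, not data or methods from the study.

```python
import math
from collections import Counter


def order_entropy(orders):
    """Shannon entropy, in bits, of a sequence of observed order labels.

    Assumption: each token of a construction is tagged with one order
    label (for example, whether the head precedes its dependent).
    """
    counts = Counter(orders)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over labels of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical token counts, invented for illustration only.
rigid = ["head-initial"] * 100                      # one order only
variable = ["head-initial"] * 50 + ["head-final"] * 50  # both orders equally often

print(order_entropy(rigid))     # no variation
print(order_entropy(variable))  # maximal variation for two orders
```

Under this assumption, a construction that always appears in one order scores 0 bits, and one that splits evenly between two orders scores 1 bit. Computing the score per dependency relation, or per pair of lexemes, would be one way to vary the level of granularity.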


Updated: 2019-11-26
