Dependency Distances and Their Frequencies in Indo-European Language, Journal of Quantitative Linguistics



Journal of Quantitative Linguistics (IF 0.7), Pub Date: 2020-06-18, DOI: 10.1080/09296174.2020.1771135
Xinying Chen _{1,2}, Kim Gerdes _{3,4,5}


ABSTRACT

The present study investigates the relationship between two features of dependencies, namely, dependency distances and dependency frequencies. The study is based on the analysis of a parallel dependency treebank that includes 10 Indo-European languages. Two corresponding random dependency treebanks are generated as baselines for comparison. After computing the values of dependency distances and their frequencies in these treebanks, we fit four functions (quadratic, exponential, logarithmic, and power-law) to the original and random datasets of each language. The preliminary result shows that there is a relation between the two dependency features for all 10 Indo-European languages. The relation can be further formalized as a power-law function, which can distinguish the observed data from the randomly generated datasets.
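To make the quantities in the abstract concrete, the sketch below counts dependency distances in a toy treebank and fits a power-law function to the resulting frequencies. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the functions were fitted, so ordinary least squares in log-log space is assumed here, and the helper names (`dependency_distances`, `distance_frequencies`, `fit_power_law`) and the two example sentences are hypothetical. Dependency distance is taken as the absolute difference between the linear positions of a word and its head.

```python
import math
from collections import Counter


def dependency_distances(heads):
    """Return the dependency distance of every non-root word in one sentence.

    `heads` lists, for each word (1-indexed position), the position of its
    head, with 0 marking the root, as in CoNLL-U style annotation.
    """
    return [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]


def distance_frequencies(sentences):
    """Count how often each dependency distance occurs across sentences."""
    counts = Counter()
    for heads in sentences:
        counts.update(dependency_distances(heads))
    return dict(sorted(counts.items()))


def fit_power_law(freqs):
    """Fit f(d) = a * d**b by least squares on log-transformed data.

    Assumption: log-log linear regression stands in for whatever fitting
    method the study used. Returns (a, b, r_squared), where r_squared is
    computed in log space.
    """
    xs = [math.log(d) for d in freqs]
    ys = [math.log(f) for f in freqs.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1 - ss_res / ss_tot


# Hypothetical toy treebank: two five-word sentences given as head lists.
toy_treebank = [
    [2, 3, 0, 5, 3],
    [2, 0, 4, 2, 2],
]
toy_freqs = distance_frequencies(toy_treebank)
a, b, r2 = fit_power_law(toy_freqs)
print(toy_freqs, a, b, r2)
```

Running the same fit on a randomly re-attached version of the treebank and comparing the goodness of fit would mirror, in miniature, the comparison against random baselines that the abstract describes.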


Updated: 2020-06-18
