Dependency Distances and Their Frequencies in Indo-European Language
Journal of Quantitative Linguistics ( IF 0.7 ) Pub Date : 2020-06-18 , DOI: 10.1080/09296174.2020.1771135
Xinying Chen 1, 2 , Kim Gerdes 3, 4, 5

ABSTRACT

The present study investigates the relationship between two features of dependencies, namely dependency distances and dependency frequencies. The study is based on the analysis of a parallel dependency treebank that covers 10 Indo-European languages. Two corresponding random dependency treebanks are generated as baselines for comparison. After computing the dependency distances and their frequencies in these treebanks, we fit four functions (quadratic, exponential, logarithmic, and power-law) to the original and random datasets of each language. The preliminary results show that the two dependency features are related in all 10 Indo-European languages. The relation can be further formalized as a power-law function, which distinguishes the observed data from the randomly generated datasets.
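The pipeline the abstract describes (compute dependency distances, count their frequencies, fit a function) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a sentence is given as a list of 1-based head positions with 0 marking the root, that dependency distance is the absolute difference between the linear positions of a dependent and its head, and that the power law f(d) = a * d^b is fitted by ordinary least squares on log-transformed values. The abstract does not state the fitting procedure, and all function names here are hypothetical.

```python
import math
from collections import Counter


def dependency_distances(heads):
    """Return the dependency distance of every non-root token.

    heads[i] is the 1-based position of the head of token i + 1;
    0 marks the root, which has no head and is skipped (an assumption).
    """
    return [abs(head - pos) for pos, head in enumerate(heads, start=1) if head != 0]


def distance_frequencies(sentences):
    """Count how often each dependency distance occurs across all sentences."""
    counts = Counter()
    for heads in sentences:
        counts.update(dependency_distances(heads))
    return dict(sorted(counts.items()))


def fit_power_law(freqs):
    """Fit f(d) = a * d**b by least squares on (log d, log f); return (a, b).

    Log-log regression is an assumed, simple stand-in for whatever
    fitting method the study actually used.
    """
    xs = [math.log(d) for d in freqs]
    ys = [math.log(f) for f in freqs.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Toy input: "The dog chased the cat" with "chased" (position 3) as root.
toy_treebank = [[2, 3, 0, 5, 3]]
toy_freqs = distance_frequencies(toy_treebank)
a, b = fit_power_law(toy_freqs)
```

The same frequency table could be passed to fitting routines for the quadratic, exponential, and logarithmic candidates, and the procedure repeated on a randomly generated treebank, to compare goodness of fit in the way the abstract outlines.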




Updated: 2020-06-18