Using ancestral state reconstruction methods for onomasiological reconstruction in multilingual word lists
Language Dynamics and Change (IF 0.5), Pub Date: 2018-06-22, DOI: 10.1163/22105832-00801002
Gerhard Jäger, Johann-Mattis List

Current efforts in computational historical linguistics are predominantly concerned with phylogenetic inference. Methods for ancestral state reconstruction have only been applied sporadically. In contrast to phylogenetic algorithms, automatic reconstruction methods presuppose phylogenetic information in order to explain what has evolved when and where. Here we report a pilot study exploring how well automatic methods for ancestral state reconstruction perform in the task of onomasiological reconstruction in multilingual word lists, where algorithms are used to infer how the words evolved along a given phylogeny and to reconstruct which cognate classes were used to express a given meaning in the ancestral languages. Comparing three different methods (Maximum Parsimony, Minimal Lateral Networks, and Maximum Likelihood) on three different test sets (Indo-European, Austronesian, Chinese), using binary and multi-state coding of the data as well as single and sampled phylogenies, we find that Maximum Likelihood largely outperforms the other methods. At the same time, however, the general performance was disappointingly low, with F-scores ranging between 0.66 (Chinese) and 0.79 (Austronesian). A closer linguistic evaluation of the reconstructions proposed by the best method and the reconstructions given in the gold standards revealed that the majority of the cases where the algorithms failed can be attributed to problems of independent semantic shift (homoplasy), to morphological processes in lexical change, and to wrong reconstructions in the independently created test sets that we employed.
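
To make the task concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a rooted binary tree written as nested tuples, one observed cognate class per language for a single meaning, and the bottom-up pass of Fitch-style parsimony as a simple stand-in for the Maximum Parsimony approach named above. The language names, cognate labels, and the helper names `fitch_root_states` and `f_score` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_root_states(tree: Tree, leaf_states: Dict[str, Set[str]]) -> Set[str]:
    """Bottom-up Fitch pass: return the candidate cognate classes at the root."""
    if isinstance(tree, str):
        return set(leaf_states[tree])
    left, right = tree
    a = fitch_root_states(left, leaf_states)
    b = fitch_root_states(right, leaf_states)
    common = a & b
    # Keep the intersection if the children agree on something, else the union.
    return common if common else a | b


def f_score(predicted: Set[str], gold: Set[str]) -> float:
    """Harmonic mean of precision and recall over sets of cognate classes."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four languages, one meaning, two cognate classes.
tree: Tree = (("A", "B"), ("C", "D"))
leaf_states = {"A": {"cog1"}, "B": {"cog1"}, "C": {"cog1"}, "D": {"cog2"}}

root = fitch_root_states(tree, leaf_states)
print(root, f_score(root, {"cog1"}))
```

In this toy case the root is reconstructed as the class shared by three of the four languages. A real evaluation along the lines described in the abstract would repeat this over many meanings and compare against an independently created gold standard; Maximum Likelihood and Minimal Lateral Networks need considerably more machinery than this sketch shows.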




Updated: 2018-06-22