A Survey of the Model Transfer Approaches to Cross-Lingual Dependency Parsing, ACM Transactions on Asian and Low-Resource Language Information Processing



ACM Transactions on Asian and Low-Resource Language Information Processing (IF 2), Pub Date: 2020-06-01, DOI: 10.1145/3383772
Ayan Das, Sudeshna Sarkar

Cross-lingual dependency parsing approaches have been employed to develop dependency parsers for languages with few or no treebanks, using the treebanks of other languages. The language for which the cross-lingual parser is developed is usually referred to as the target language, and the language whose treebank is used to train the cross-lingual parser model is referred to as the source language. Cross-lingual approaches to dependency parsing may be broadly classified into three categories: model transfer, annotation projection, and treebank translation. This survey provides an overview of the various aspects of the model transfer approach to cross-lingual dependency parsing. We present a classification of the model transfer approaches based on the different aspects of the method. We discuss some of the challenges associated with cross-lingual parsing and the techniques used to address them. To address the difference in vocabulary between two languages, some approaches use only non-lexical features of the words to train the models, while others use shared representations of the words. Some approaches address morphological differences by chunk-level transfer rather than word-level transfer. The syntactic differences between the source and target languages are sometimes addressed by transforming the source-language treebanks or by combining the resources of multiple source languages. In addition, a cross-lingual transfer parser model may be developed for a specific target language, or it may be trained to parse sentences of multiple languages. With respect to these aspects, we look at the different ways in which the methods can be classified, and we further classify and discuss the different approaches from the perspective of each aspect. We also demonstrate the performance of the transferred models on a common dataset under different settings corresponding to the classification aspects.
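
As a small illustration of the non-lexical setting mentioned in the abstract, the sketch below strips word forms and lemmas from a treebank sentence so that only part-of-speech tags, heads, and relations remain for training. It is only a sketch under stated assumptions: the abstract does not name a treebank format, so the CoNLL-U column layout is assumed here, and the names `delexicalize` and `SAMPLE` are hypothetical.

```python
# Minimal sketch of delexicalization, assuming CoNLL-U input (an assumption,
# not something the abstract specifies). All names here are hypothetical.

# CoNLL-U column indices that carry lexical information.
FORM, LEMMA = 1, 2


def delexicalize(conllu_sentence: str) -> str:
    """Blank out FORM and LEMMA; keep POS tags, features, heads, relations."""
    out = []
    for line in conllu_sentence.splitlines():
        if line.startswith("# text"):
            # The raw-text comment repeats the word forms, so drop it.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Other comments or blank lines are passed through unchanged.
            out.append(line)
            continue
        cols[FORM] = "_"
        cols[LEMMA] = "_"
        out.append("\t".join(cols))
    return "\n".join(out)


# A two-token example sentence in the assumed format.
SAMPLE = "\n".join([
    "# text = Dogs bark",
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_",
])

if __name__ == "__main__":
    print(delexicalize(SAMPLE))
```

A parser trained on source-language sentences processed this way sees no source vocabulary, which is what allows it to be applied to a target language that shares the same tag and relation inventory.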


Updated: 2020-06-01
