Cross-lingual learning for text processing: A survey
Expert Systems with Applications (IF 7.5), Pub Date: 2020-08-02, DOI: 10.1016/j.eswa.2020.113765
Matúš Pikuliak, Marián Šimko, Mária Bieliková

Many intelligent systems in business, government, and academia take natural language as input for their inference, and some also communicate with users in natural language. The natural language processing in these systems is now often done with machine learning models. However, machine learning needs training data, and such data are often unavailable for low-resource languages. The lack of data, and the resulting low performance of natural language processing, can be addressed with cross-lingual learning, a paradigm for transferring knowledge from one natural language to another. This transfer of knowledge can help overcome the lack of data in target languages and make it possible to create intelligent systems and machine learning models for languages where this was not possible previously.

Despite its increasing popularity and potential, no comprehensive survey of cross-lingual learning has been conducted. We survey 172 papers on cross-lingual learning for text processing and examine the tasks, datasets, and languages they use. The most important contribution of our work is that we identify and analyze four types of cross-lingual transfer based on “what” is being transferred. This insight might help other NLP researchers and practitioners understand how to use cross-lingual learning for a wide range of problems. In addition, we identify what we consider to be the most important research directions, which might help the community focus its future work in cross-lingual learning. We present a comprehensive table of all the surveyed papers with various data on the cross-lingual learning techniques they use. The table can be used to find relevant papers and compare approaches to cross-lingual learning. To the best of our knowledge, no survey of cross-lingual text processing techniques has been done at this scope before.




Updated: 2020-08-02