Zero‐anaphora resolution in Korean based on deep language representation model: BERT
ETRI Journal (IF 1.3), Pub Date: 2020-10-25, DOI: 10.4218/etrij.2019-0441
Youngtae Kim, Dongyul Ra, Soojong Lim

It is necessary to achieve high performance in the task of zero-anaphora resolution (ZAR) to fully understand texts in Korean, Japanese, Chinese, and various other languages. Owing to the success of deep learning in recent years, deep-learning-based models are being employed to build ZAR systems. However, the objective of building a high-quality ZAR system is far from being achieved even with these models. To enhance current ZAR techniques, we fine-tuned a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model. Notably, BERT is a general language representation model that enables systems to utilize deep bidirectional contextual information in a natural language text. It extensively exploits the attention mechanism of the Transformer sequence-transduction model. In our model, classification is performed simultaneously for all the words in the input word sequence to decide whether each word can be an antecedent. We seek end-to-end learning by disallowing any use of hand-crafted or dependency-parsing features. Experimental results show that, compared with other models, our approach significantly improves the performance of ZAR.
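The core idea described in the abstract is per-token classification on top of a fine-tuned BERT encoder: every word in the input sequence is scored as a possible antecedent, with no hand-crafted or dependency-parsing features. The sketch below is a minimal illustration of that idea, assuming the Hugging Face transformers library and a multilingual BERT checkpoint; the model name, classification head, and example sentence are illustrative assumptions and not the authors' exact implementation.

```python
# Minimal sketch of BERT-based antecedent classification for ZAR.
# Assumptions (not from the paper): Hugging Face transformers,
# the "bert-base-multilingual-cased" checkpoint, and a simple
# binary (antecedent / not antecedent) linear head over each token.
import torch
from torch import nn
from transformers import BertModel, BertTokenizerFast

class ZARAntecedentClassifier(nn.Module):
    def __init__(self, pretrained_name: str = "bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(pretrained_name)
        # One binary decision per token: can this token be the antecedent?
        self.classifier = nn.Linear(self.encoder.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.classifier(hidden)  # (batch, seq_len, 2) logits

tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")
model = ZARAntecedentClassifier()

# Example Korean input containing a zero pronoun in the second clause;
# all tokens are classified simultaneously in one forward pass.
batch = tokenizer(["철수는 학교에 갔다 . 그리고 밥을 먹었다 ."],
                  return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
antecedent_scores = logits.softmax(dim=-1)[..., 1]  # P(token is antecedent)
```

Under this formulation, training reduces to a standard token-level cross-entropy loss over the fine-tuned encoder, with the positive class marking the antecedent tokens of a given zero pronoun.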
