A note on constituent parsing for Korean
Natural Language Engineering (IF 2.3), Pub Date: 2020-11-10, DOI: 10.1017/s1351324920000479
Mija Kim, Jungyeul Park

This study addresses common issues in constituent parsing for Korean, including quantitative and qualitative error analyses of parsing results. Earlier treebank grammars were regarded as interpretable under various annotation schemes, whereas recent parsers are much harder for humans to interpret. This paper therefore aims to establish a concrete typology of parsing errors, to describe how these parsers handle sentences, and to show the statistical distribution of the errors, using state-of-the-art statistical and neural parsers. To this end, we train and evaluate statistical and neural parsing systems on the phrase-structure Sejong treebank and obtain an $F_1$ score of up to 89.18%, which outperforms previous constituent parsing results for Korean. We also define best practices for fair comparison with future work by proposing a standard corpus division for the Sejong treebank.
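For readers unfamiliar with the metric, the $F_1$ score reported for constituent parsing is the harmonic mean of labeled bracket precision and recall. The following is a minimal sketch of that computation, not the evaluation tool used in the paper: the function names `labeled_spans` and `bracket_f1` and the toy English trees are hypothetical, and it assumes trees are given as bracketed strings in which part-of-speech brackets are not counted.

```python
from collections import Counter


def labeled_spans(tree: str) -> Counter:
    """Collect (label, start, end) spans of phrasal nodes from a bracketed tree.

    Assumption: preterminal (part-of-speech) brackets are skipped, so only
    nodes that dominate at least one other bracket are counted.
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []      # open nodes as [label, start, has_child_bracket]
    position = 0    # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append([tokens[i + 1], position, False])
            i += 2  # skip the bracket and its label
        elif tok == ")":
            label, start, has_child = stack.pop()
            if has_child:
                spans[(label, start, position)] += 1
            if stack:
                stack[-1][2] = True  # the parent dominates a bracket
            i += 1
        else:
            position += 1  # a word (leaf)
            i += 1
    return spans


def bracket_f1(gold: str, pred: str) -> tuple:
    """Return (precision, recall, f1) over labeled spans of two trees."""
    gold_spans, pred_spans = labeled_spans(gold), labeled_spans(pred)
    matched = sum((gold_spans & pred_spans).values())
    precision = matched / sum(pred_spans.values()) if pred_spans else 0.0
    recall = matched / sum(gold_spans.values()) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy trees: the prediction attaches the object NP too high.
gold_tree = "(S (NP (N a)) (VP (V b) (NP (N c))))"
pred_tree = "(S (NP (N a)) (VP (V b)) (NP (N c)))"
print(bracket_f1(gold_tree, pred_tree))
```

A corpus-level score would sum matched, predicted, and gold span counts over all sentences before computing precision and recall, rather than averaging per-sentence values.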

Updated: 2020-11-10