A Corpus of Adpositional Supersenses for Mandarin Chinese
arXiv - CS - Digital Libraries, Pub Date: 2020-03-18, DOI: arxiv-2003.08437
Siyao Peng, Yang Liu, Yilun Zhu, Austin Blodgett, Yushi Zhao, Nathan Schneider

Adpositions are frequent markers of semantic relations, but they are highly ambiguous and vary significantly from language to language. Moreover, there is a dearth of annotated corpora for investigating the cross-linguistic variation of adposition semantics, or for building multilingual disambiguation systems. This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese; to the best of our knowledge, this is the first Chinese corpus to be broadly annotated with adposition semantics. Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria, though its development focused primarily on English prepositions (Schneider et al., 2018). We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English. On a Mandarin translation of The Little Prince, we achieve high inter-annotator agreement and analyze semantic correspondences of adposition tokens in bitext.

Updated: 2020-03-20