Developing the Persian Wordnet of Verbs Using Supervised Learning
ACM Transactions on Asian and Low-Resource Language Information Processing (IF 2), Pub Date: 2021-05-26, DOI: 10.1145/3450969
Zahra Mousavi, Heshaam Faili

Wordnets are widely used as a core resource in natural language processing and information retrieval, so their accuracy directly affects the performance of the applications built on them. This paper presents a fully automated method for extending a previously developed Persian wordnet with more comprehensive and accurate verb entries. First, some Persian verbs are linked to Princeton WordNet (PWN) synsets through a bilingual dictionary. Because most Persian verbs are compound verbs, a feature set describing the semantic behavior of compound verbs is proposed and used in a supervised classification system that selects the proper links for inclusion in the wordnet. The training set is produced from a pre-existing Persian wordnet, FarsNet, together with a similarity-based method. The result is the largest automatically developed Persian wordnet, with more than 27,000 words, 28,000 PWN synsets and 67,000 word-sense pairs; it substantially outperforms the previous Persian wordnet, which has about 16,000 words, 22,000 PWN synsets and 38,000 word-sense pairs.
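
The two-stage idea in the abstract (generate candidate verb-to-synset links from a bilingual dictionary, then let a supervised classifier decide which links to keep) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy dictionary, the made-up synset IDs, the single "support" feature, the hand-labelled training pairs and the perceptron are not the authors' data, feature set or classifier, and the function names `candidate_links`, `train` and `select_links` are hypothetical.

```python
from collections import defaultdict

# Toy data, all hypothetical: a transliterated Persian-to-English dictionary
# and an English lemma-to-synset index with made-up synset IDs.
BILINGUAL = {
    "khordan": ["eat", "consume"],
    "zamin khordan": ["fall"],  # a compound verb
}
LEMMA_TO_SYNSETS = {
    "eat": ["syn_eat_1", "syn_eat_2"],
    "consume": ["syn_eat_2", "syn_use_up"],
    "fall": ["syn_fall_1", "syn_fall_2"],
}


def candidate_links(verb):
    """Map each candidate synset to its support: the fraction of the
    verb's dictionary translations that are lemmas of that synset."""
    translations = BILINGUAL.get(verb, [])
    counts = defaultdict(int)
    for english in translations:
        for synset in set(LEMMA_TO_SYNSETS.get(english, [])):
            counts[synset] += 1
    return {synset: n / len(translations) for synset, n in counts.items()}


def train(examples, epochs=20):
    """Fit a one-feature perceptron on (support, label) pairs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in examples:
            pred = 1 if w * x + b > 0 else 0
            # Standard perceptron update; no change when pred == y.
            w += (y - pred) * x
            b += (y - pred)
    return w, b


def select_links(verb, model):
    """Return the candidate synsets the trained model accepts, sorted."""
    w, b = model
    return sorted(s for s, x in candidate_links(verb).items() if w * x + b > 0)


# Hand-labelled toy pairs (support, keep-link?). The paper instead derives
# its training set from FarsNet and a similarity-based method.
TRAINING = [(1.0, 1), (0.5, 0), (0.25, 0), (0.75, 1)]
MODEL = train(TRAINING)

print(select_links("khordan", MODEL))  # ['syn_eat_2']
```

With only one feature, the sketch cannot separate the two candidate synsets of the compound verb `zamin khordan`, since both receive the same support. That limitation is the kind of case a richer feature set for compound verbs, such as the one the abstract mentions, would need to address.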

Updated: 2021-05-26