Sememe knowledge computation: a review of recent advances in application and expansion of sememe knowledge bases
Frontiers of Computer Science (IF 3.4), Pub Date: 2021-04-29, DOI: 10.1007/s11704-020-0002-4
Fanchao Qi, Ruobing Xie, Yuan Zang, Zhiyuan Liu, Maosong Sun

In linguistics, a sememe is defined as the minimum semantic unit of a language. Sememe knowledge bases are built by manually annotating words and phrases with sememes, and HowNet is the best known of them. In the era of statistical natural language processing, HowNet was used extensively across many natural language processing tasks and proved effective for understanding and using language. In the era of deep learning, although data are thought to be of vital importance, a number of studies have incorporated sememe knowledge bases such as HowNet into neural network models to improve system performance, with successful attempts in tasks including word representation learning, language modeling, and semantic composition. In addition, because manually annotating and updating sememe knowledge bases is costly, some work has used machine learning methods to automatically predict sememes for words and phrases and thereby expand these knowledge bases. Other studies try to extend HowNet to other languages by automatically predicting sememes for words and phrases in a new language. In this paper, we summarize recent studies on the application and expansion of sememe knowledge bases and point out some future directions for research on sememes.




Updated: 2021-04-30