A survey of word embeddings based on deep learning
Computing ( IF 3.7 ) Pub Date : 2019-11-12 , DOI: 10.1007/s00607-019-00768-7
Shirui Wang , Wenan Zhou , Chao Jiang

Word embeddings are the representational basis for downstream natural language processing tasks: they capture lexical semantics in numerical form and thereby make the abstract semantic content of words computationally tractable. Recently, word embedding approaches based on deep learning have attracted extensive attention and are widely used in many tasks, such as text classification, knowledge mining, question answering, and smart Internet of Things systems. These neural-network-based models rest on the distributional hypothesis, so the semantic association between words can be computed efficiently in a low-dimensional space. However, the semantics that most models can express are constrained by the context distribution of each word in the corpus, while logic and commonsense knowledge remain underexploited. How to use massive multi-source data to better represent natural language and world knowledge therefore still needs to be explored. In this paper, we introduce recent advances in neural-network-based word embeddings along with their technical features, summarize the key challenges and existing solutions, and give an outlook on future research and applications.
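The point about computing semantic association in a low-dimensional space can be illustrated with a minimal sketch. The toy vectors and words below are invented for illustration; real models such as word2vec or GloVe learn vectors of 100-300 dimensions from a corpus, but the similarity computation (cosine of the angle between vectors) is the same.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings; values are illustrative only.
embeddings = {
    "king":  np.array([0.8, 0.1, 0.7, 0.2]),
    "queen": np.array([0.7, 0.2, 0.8, 0.1]),
    "apple": np.array([0.1, 0.9, 0.0, 0.8]),
}

def cosine_similarity(u, v):
    """Semantic association as the cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
# Words with similar context distributions end up close in the vector
# space, so the related pair scores higher than the unrelated pair.
print(sim_related > sim_unrelated)  # True
```

This efficiency is exactly why the distributional hypothesis underlies these models: similarity reduces to a dot product rather than a symbolic comparison.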

Updated: 2019-11-12