Hybrid neural conditional random fields for multi-view sequence labeling
Knowledge-Based Systems (IF 8.8) Pub Date: 2019-10-24, DOI: 10.1016/j.knosys.2019.105151
Xuli Sun , Shiliang Sun , Minzhi Yin , Hao Yang

In traditional machine learning, the conditional random field (CRF) is the mainstream probabilistic model for sequence labeling problems. Rather than decoding each label independently, CRF considers the relations between adjacent labels, which is expected to yield better performance. However, few multi-view learning methods involving CRF can be applied directly to sequence labeling tasks. In this paper, we propose a novel multi-view CRF model for labeling sequential data, called MVCRF, which exploits the two principles of multi-view learning: consensus and complementarity. We first use different neural networks to extract features from multiple views. Then, considering the consistency among the different views, we introduce a joint representation space for the extracted features and minimize the distance between the two views' representations as a regularizer. Meanwhile, following the complementarity principle, the features of the multiple views are integrated into the CRF framework. We train MVCRF in an end-to-end fashion and evaluate it on two benchmark data sets. The experimental results show that MVCRF achieves state-of-the-art performance: an F1 score of 95.44% for chunking on CoNLL-2000, and 95.06% for chunking and 96.99% for named entity recognition (NER) on CoNLL-2003.
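The pipeline described in the abstract — per-view feature extraction, a consensus (distance) regularizer in a joint representation space, and view integration into a linear-chain CRF — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear projections standing in for the per-view neural networks, all dimensions, and the random parameters are hypothetical, and only Viterbi decoding (not CRF training) is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a length-5 sequence seen through two views.
T, d1, d2, h, n_labels = 5, 8, 6, 4, 3

# Toy "encoders": one linear projection per view into a shared h-dim space
# (the paper uses different neural networks per view; linear maps stand in here).
W1 = rng.normal(size=(d1, h))
W2 = rng.normal(size=(d2, h))

x1 = rng.normal(size=(T, d1))   # view-1 input features
x2 = rng.normal(size=(T, d2))   # view-2 input features

z1, z2 = x1 @ W1, x2 @ W2       # representations in the joint space

# Consensus principle: penalize the distance between the two views'
# representations (added to the training loss as a regularizer).
consensus_loss = np.mean(np.sum((z1 - z2) ** 2, axis=1))

# Complementarity principle: concatenate both views as input to the CRF layer.
z = np.concatenate([z1, z2], axis=1)          # shape (T, 2h)
W_emit = rng.normal(size=(2 * h, n_labels))
emissions = z @ W_emit                        # per-position label scores
transitions = rng.normal(size=(n_labels, n_labels))  # adjacent-label scores

def viterbi(emissions, transitions):
    """Most likely label sequence under linear-chain CRF scores."""
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = np.argmax(total, axis=0)
        score = np.max(total, axis=0)
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

labels = viterbi(emissions, transitions)
print(consensus_loss, labels)
```

In end-to-end training, the negative CRF log-likelihood plus the consensus term would be minimized jointly over the encoder and CRF parameters; the decoding step above is what produces the final label sequence at test time.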




Updated: 2020-01-16