Discriminative neural network pruning in a multiclass environment: A case study in spoken emotion recognition
Speech Communication (IF 3.2), Pub Date: 2020-04-02, DOI: 10.1016/j.specom.2020.03.006
Máximo E. Sánchez-Gutiérrez, Pedro P. González-Pérez

Deep learning has become one of the most widely accepted paradigms in machine learning. It focuses on hierarchical data models and builds on the notion that learning high-level data representations requires a better understanding of intermediate-level representations. Restricted Boltzmann machines and deep belief networks are two main types of deep learning models commonly used in a wide array of classification and pattern recognition tasks, such as natural language recognition, neuroimaging studies, time series forecasting, parametric voice synthesis, and speech emotion recognition. Recent machine learning studies suggest that deep learning networks can map input features into a representation that is more advantageous for classification, hence improving the classification process. However, selecting a suitable deep learning architecture for a specific problem can be difficult. In this study, we investigate whether discriminative measures such as ANOVA, Pearson's correlation, Fisher score, gain ratio, ReliefF, and OneR, among others, can help identify useful neural nodes in a deep learning network, since normally not all hidden neurons provide useful information for a classification task. Our approach uses some of these discriminative measures to rank the hidden neurons based on their output values and then prunes them according to their position in that ranking. Our results indicate that this approach is also helpful in multiclass classification problems, and the pruning process seems to have a positive effect in reducing the resulting error rate.
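The rank-then-prune idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the one-way ANOVA F statistic as the discriminative measure (one of several the abstract names), uses synthetic activations in place of a trained network, and the function names `anova_f_scores` and `prune_mask` and the `keep_fraction` parameter are hypothetical.

```python
import numpy as np


def anova_f_scores(activations, labels):
    """One-way ANOVA F statistic per hidden unit (one column per unit)."""
    classes = np.unique(labels)
    n_samples, n_units = activations.shape
    grand_mean = activations.mean(axis=0)
    between = np.zeros(n_units)
    within = np.zeros(n_units)
    for c in classes:
        group = activations[labels == c]
        group_mean = group.mean(axis=0)
        # Variation of the class means around the overall mean.
        between += group.shape[0] * (group_mean - grand_mean) ** 2
        # Variation of the samples around their own class mean.
        within += ((group - group_mean) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = n_samples - len(classes)
    # The small epsilon avoids division by zero for constant units.
    return (between / df_between) / (within / df_within + 1e-12)


def prune_mask(scores, keep_fraction):
    """Boolean mask that keeps the top-ranked fraction of hidden units."""
    n_keep = max(1, int(round(keep_fraction * scores.size)))
    ranking = np.argsort(scores)[::-1]  # best-scoring unit first
    mask = np.zeros(scores.size, dtype=bool)
    mask[ranking[:n_keep]] = True
    return mask


# Synthetic stand-in for hidden-layer outputs: 4 classes, 10 hidden units.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 50)
hidden = rng.normal(size=(200, 10))
hidden[:, 0] += labels                # unit 0 separates all four classes
hidden[:, 1] += 2.0 * (labels == 2)   # unit 1 separates class 2 from the rest

scores = anova_f_scores(hidden, labels)
mask = prune_mask(scores, keep_fraction=0.2)
pruned_hidden = hidden[:, mask]       # activations of the surviving units
```

In a real network the same mask would be applied to the corresponding rows or columns of the layer's weight matrices, and the choice of measure and of how many units to keep would be tuned against validation error.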




Last updated: 2020-04-02