Speech enhancement through improvised conditional generative adversarial networks, Microprocessors and Microsystems


Microprocessors and Microsystems (IF 1.9), Pub Date: 2020-09-24, DOI: 10.1016/j.micpro.2020.103281
Saravana Ram Ram, Vinoth Kumar M, Balambigai Subramanian, Nebojsa Bacanin, Miodrag Zivkovic, Ivana Strumberger

Speech enhancement aims to improve speech quality through post-processing algorithms. The main objective of many speech signal processing techniques is to enhance intelligibility while also improving the overall perceptual quality score. Generative adversarial networks (GANs) generate new data from the statistics of a training set and have shown impressive results in speech enhancement in recent years. Although GANs do not require prior and posterior probability calculations, they are generally hard to train. The problem worsens in low-data regimes, so an effective GAN mechanism is needed. In this work, we propose an improvised conditional generative adversarial network (cGAN) in which the generator enhances the noisy input, while the discriminator, which embeds the improvised techniques, tries to distinguish the generator output from the clean content in the database under the GAN conditions discussed. The proposed method is assessed in terms of PEAQ score and equal error rate. Experimental results on the Aurora-2 signal set show that the improved cGAN is very effective compared with traditional GANs. We also ran a subjective preference test, in which 82.14% of the subjects preferred the output of the proposed cGAN to that of other conventional methods.
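The generator/discriminator roles described in the abstract can be illustrated with a minimal NumPy sketch of the standard conditional GAN objective. This is not the authors' improvised method, whose details the abstract does not give. The linear `generator`, the logistic `discriminator`, the binary cross-entropy losses, and the random toy frames are all assumptions chosen only to show how the discriminator is conditioned on the noisy input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(prob, target):
    # Binary cross-entropy averaged over frames; clipping avoids log(0).
    prob = np.clip(prob, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)))

def generator(noisy, g_weights):
    # Hypothetical toy generator: a linear map applied to each noisy frame.
    return noisy @ g_weights

def discriminator(candidate, noisy, d_weights):
    # Conditional discriminator: it scores a candidate frame together with
    # the noisy frame it was derived from (the "condition").
    pair = np.concatenate([candidate, noisy], axis=1)
    return sigmoid(pair @ d_weights)

def cgan_losses(clean, noisy, g_weights, d_weights):
    enhanced = generator(noisy, g_weights)
    p_real = discriminator(clean, noisy, d_weights)     # should approach 1
    p_fake = discriminator(enhanced, noisy, d_weights)  # should approach 0
    d_loss = bce(p_real, np.ones_like(p_real)) + bce(p_fake, np.zeros_like(p_fake))
    # The generator is rewarded when the discriminator labels its output as clean.
    g_loss = bce(p_fake, np.ones_like(p_fake))
    return d_loss, g_loss

# Toy data standing in for feature frames (not Aurora-2 data).
rng = np.random.default_rng(0)
frames, dim = 8, 16
clean = rng.standard_normal((frames, dim))
noisy = clean + 0.3 * rng.standard_normal((frames, dim))

g_weights = np.eye(dim)        # identity generator: passes the noisy input through
d_weights = np.zeros(2 * dim)  # untrained discriminator: outputs 0.5 everywhere

d_loss, g_loss = cgan_losses(clean, noisy, g_weights, d_weights)
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

In a real training loop the two losses would be minimized alternately with respect to `d_weights` and `g_weights`. With the untrained discriminator above, every score is 0.5, so the losses reduce to 2 ln 2 and ln 2.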


Updated: 2020-09-29
