Applications of Deep Learning to Audio Generation
IEEE Circuits and Systems Magazine (IF 6.9), Pub Date: 2019-01-01, DOI: 10.1109/mcas.2019.2945210
Yuanjun Zhao, Xianjun Xia, Roberto Togneri

In recent years, deep-learning-based machine learning systems have demonstrated remarkable success across a wide range of learning tasks in domains such as computer vision, speech recognition, and other pattern recognition applications. This article contributes a timely review of, and introduction to, state-of-the-art deep learning techniques and their effectiveness in speech/acoustic signal processing. Various deep learning architectures are examined in depth under the categories of discriminative and generative algorithms, including the recent Generative Adversarial Networks (GANs) as an integrated model. A comprehensive overview of applications in audio generation is then presented. Building on these approaches, we discuss how deep learning methods can benefit the field of speech/acoustic signal synthesis and the potential issues that need to be addressed in prospective real-world scenarios. We hope this survey provides a valuable reference for practitioners seeking to innovate in the use of deep learning approaches for speech/acoustic signal generation.

Updated: 2019-01-01