A Particular Character Speech Synthesis System Based on Deep Learning
IETE Technical Review (IF 2.4), Pub Date: 2020-10-08, DOI: 10.1080/02564602.2020.1824623
Yuan Mei, Deng-pan Ye, Shun-zhi Jiang, Jia-rui Liu

ABSTRACT

A particular-character speech synthesis system is a text-to-speech (TTS) system that produces speech with a specific speaker's voice characteristics. Traditional methods based on machine learning require a large number of training samples and many training iterations. In this paper, we propose a novel TTS system based on fully convolutional neural networks and an attention mechanism. The system can be trained from scratch with random initialization and produces its output end to end. By adding an attention layer and an attention loss, it adapts better to the pronunciation, intonation, and accent of a specific speaker. Experimental results show that our speech synthesis framework achieves stronger model performance, synthesizing higher-quality forged audio of a specific character with a smaller training set and fewer iterations.
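
The abstract does not say what form the attention loss takes. As an illustration only, the sketch below uses one common choice in attention-based TTS, a "guided attention" penalty, which discourages attention weights that stray far from the diagonal alignment between text positions and audio frames. The function names, the width parameter `g`, and its default value of 0.2 are assumptions made for this example and are not taken from the paper.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix W[n][t]: 0 on the diagonal, approaching 1 far from it.

    Assumed form: W[n][t] = 1 - exp(-((n/N - t/T)^2) / (2 * g^2)).
    """
    return [
        [
            1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
            for t in range(n_frames)
        ]
        for n in range(n_text)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of attention[n][t] * W[n][t] over all text positions and frames."""
    n_text = len(attention)
    n_frames = len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(n_text)
        for t in range(n_frames)
    )
    return total / (n_text * n_frames)


# Toy 4x4 attention matrices: one aligned with the diagonal, one reversed.
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
anti_diagonal = [[1.0 if n == 3 - t else 0.0 for t in range(4)] for n in range(4)]

print(guided_attention_loss(diagonal))       # no penalty for a diagonal alignment
print(guided_attention_loss(anti_diagonal))  # positive penalty for a reversed one
```

In a trainable model, a term like this would be added to the reconstruction loss so that the attention layer is pushed toward a roughly monotonic text-to-audio alignment early in training. That is one plausible route to needing fewer iterations, though the paper's actual formulation may differ.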

Updated: 2020-10-08
