A Particular Character Speech Synthesis System Based on Deep Learning
IETE Technical Review ( IF 2.4 ) Pub Date : 2020-10-08 , DOI: 10.1080/02564602.2020.1824623
Yuan Mei 1 , Deng-pan Ye 1 , Shun-zhi Jiang 1 , Jia-rui Liu 1

ABSTRACT

A particular-character speech synthesis system is a text-to-speech (TTS) system that produces speech with the voice characteristics of a specific speaker. Traditional machine-learning methods require a large number of training samples and many iterations. In this paper, we propose a novel TTS system based on fully convolutional neural networks and an attention mechanism. The system can be trained from scratch with random initialization and produces end-to-end output. By adding an attention layer and an attention loss, it adapts better to the pronunciation, intonation, and accent of a specific speaker. Experimental results show that our speech synthesis framework achieves stronger model performance, synthesizing higher-quality forged audio of a specific character with a smaller training set and fewer iterations.




Updated: 2020-10-08