VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
arXiv - CS - Sound | Pub Date: 2021-06-18 | DOI: arxiv-2106.10132
Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng

One-shot voice conversion (VC), which performs conversion across arbitrary speakers with only a single target-speaker utterance for reference, can be effectively achieved by speech representation disentanglement. Existing work generally ignores the correlation between different speech representations during training, which causes leakage of content information into the speaker representation and thus degrades VC performance. To alleviate this issue, we employ vector quantization (VQ) for content encoding and introduce mutual information (MI) as the correlation metric during training, to achieve proper disentanglement of content, speaker and pitch representations, by reducing their inter-dependencies in an unsupervised manner. Experimental results reflect the superiority of the proposed method in learning effective disentangled speech representations for retaining source linguistic content and intonation variations, while capturing target speaker characteristics. In doing so, the proposed approach achieves higher speech naturalness and speaker similarity than current state-of-the-art one-shot VC systems. Our code, pre-trained models and demo are available at https://github.com/Wendison/VQMIVC.
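As a rough illustration of the vector quantization step mentioned in the abstract, the sketch below maps each frame-level feature vector to its nearest entry in a codebook. This is not the authors' implementation: the function name `quantize`, the two-dimensional toy codebook, and the sample frames are assumptions made for illustration only. In the actual system the codebook would be learned jointly with the encoders, and the mutual-information terms are not shown here.

```python
def quantize(frames, codebook):
    """Map each frame vector to the index and value of its nearest codebook entry."""
    indices = []
    quantized = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook vector.
        dists = [sum((f - c) ** 2 for f, c in zip(frame, code)) for code in codebook]
        # Pick the codebook entry with the smallest distance.
        idx = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Hypothetical toy values; a real codebook is learned and much larger.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]

indices, quantized = quantize(frames, codebook)
print(indices)  # [1, 0, 2]
```

Replacing continuous frame features with discrete codebook entries in this way limits how much information a content representation can carry, which is one plausible reason such a bottleneck helps keep speaker detail out of the content stream.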

Updated: 2021-07-22