Dialogue act based expressive speech synthesis in limited domain for the Czech language
Informatica ( IF 3.3 ) Pub Date : 2020-06-15 , DOI: 10.31449/inf.v44i2.2559
Martin Grůber , Jindřich Matoušek , Zdeněk Hanzlíček , Daniel Tihelka

This paper deals with expressive speech synthesis in a dialogue. Dialogue acts, which are discrete expressive categories, are used to describe expressivity. The aim of the work is to create a procedure for developing expressive speech synthesis for a dialogue system in a limited domain. Here the domain is limited to dialogues between a human and a computer on the topic of reminiscing about personal photographs. To incorporate expressivity into synthetic speech, current algorithms used for neutral speech synthesis are modified. An expressive speech corpus is recorded and annotated using a predefined set of dialogue acts, and its acoustic analysis is performed. Unit selection and HMM-based methods are used to synthesize expressive speech, and an evaluation using listening tests is presented. The listeners assess two basic aspects of synthetic expressive speech for isolated utterances: speech quality and expressivity perception. The evaluation is also performed for utterances in a dialogue to assess the appropriateness of synthetic expressive speech. It can be concluded that synthetic expressive speech is rated positively even though it is of worse quality than neutral synthetic speech. Nevertheless, synthetic expressive speech is able to convey expressivity to listeners and to improve the naturalness of the synthetic speech.

Updated: 2020-06-15