Generating coherent spontaneous speech and gesture from text
arXiv - CS - Sound Pub Date : 2021-01-14 , DOI: arxiv-2101.05684
Simon Alexanderson, Éva Székely, Gustav Eje Henter, Taras Kucherenko, Jonas Beskow

Embodied human communication encompasses both verbal (speech) and non-verbal information (e.g., gesture and head movements). Recent advances in machine learning have substantially improved the technologies for generating synthetic versions of both of these types of data: On the speech side, text-to-speech systems are now able to generate highly convincing, spontaneous-sounding speech using unscripted speech audio as the source material. On the motion side, probabilistic motion-generation methods can now synthesise vivid and lifelike speech-driven 3D gesticulation. In this paper, we put these two state-of-the-art technologies together in a coherent fashion for the first time. Concretely, we demonstrate a proof-of-concept system, trained on a single-speaker audio and motion-capture dataset, that is able to generate both speech and full-body gestures together from text input. In contrast to previous approaches for joint speech-and-gesture generation, we generate full-body gestures from speech synthesis trained on recordings of spontaneous speech from the same person as the motion-capture data. We illustrate our results by visualising gesture spaces and text-speech-gesture alignments, and through a demonstration video at https://simonalexanderson.github.io/IVA2020 .

Updated: 2021-01-15