Iterative Text-based Editing of Talking-heads Using Neural Retargeting
arXiv - CS - Graphics Pub Date : 2020-11-21 , DOI: arxiv-2011.10688
Xinwei Yao, Ohad Fried, Kayvon Fatahalian, Maneesh Agrawala

We present a text-based tool for editing talking-head video that enables an iterative editing workflow. On each iteration, users can edit the wording of the speech, refine mouth motions where needed to reduce artifacts, and manipulate non-verbal aspects of the performance by inserting mouth gestures (e.g., a smile) or changing the overall performance style (e.g., energetic, mumble). Our tool requires only 2-3 minutes of video of the target actor and synthesizes the video for each iteration in about 40 seconds, allowing users to quickly explore many editing possibilities as they iterate. Our approach is based on two key ideas. (1) We develop a fast phoneme search algorithm that can quickly identify phoneme-level subsequences of the source repository video that best match a desired edit. This enables our fast iteration loop. (2) We leverage a large repository of video of a source actor and develop a new self-supervised neural retargeting technique for transferring the mouth motions of the source actor to the target actor. This allows us to work with relatively short target actor videos, making our approach applicable in many real-world editing scenarios. Finally, our refinement and performance controls give users the ability to further fine-tune the synthesized results.
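
The abstract does not spell out how the phoneme search in idea (1) works. As a rough illustration of what matching an edit against phoneme-level subsequences of a repository can look like, the sketch below covers a query phoneme sequence with as few contiguous repository segments as possible, using a brute-force greedy longest match. The function names, the list-of-strings phoneme representation, and the "fewest segments" objective are all assumptions made for illustration; this is not the authors' algorithm and makes no attempt at their speed.

```python
from typing import List, Tuple


def longest_match(repo: List[str], query: List[str], start: int) -> Tuple[int, int]:
    """Find the longest run of query[start:] that appears contiguously in repo.

    Returns (repo_index, length); length is 0 if query[start] is absent.
    Hypothetical helper: a brute-force scan, O(len(repo) * len(query)).
    """
    best_idx, best_len = -1, 0
    for i in range(len(repo)):
        k = 0
        while (
            i + k < len(repo)
            and start + k < len(query)
            and repo[i + k] == query[start + k]
        ):
            k += 1
        if k > best_len:
            best_idx, best_len = i, k
    return best_idx, best_len


def cover_with_segments(repo: List[str], query: List[str]) -> List[Tuple[int, int]]:
    """Cover the query with contiguous repo segments, greedily taking the longest.

    Returns a list of (repo_index, length) pairs in query order.
    Assumption: fewer, longer segments are preferable; a real system would
    likely also weigh visual or timing costs, which are ignored here.
    """
    segments: List[Tuple[int, int]] = []
    pos = 0
    while pos < len(query):
        idx, length = longest_match(repo, query, pos)
        if length == 0:
            raise ValueError(f"phoneme {query[pos]!r} not found in repository")
        segments.append((idx, length))
        pos += length
    return segments
```

Each returned pair points at a span of the repository sequence. In a pipeline like the one described, such spans would correspond to source-actor video frames whose mouth motions are then retargeted to the target actor.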

Updated: 2020-11-25