Crossing You in Style: Cross-modal Style Transfer from Music to Visual Arts
arXiv - CS - Multimedia Pub Date : 2020-09-17 , DOI: arxiv-2009.08083
Cheng-Che Lee, Wan-Yi Lin, Yen-Ting Shih, Pei-Yi Patricia Kuo, Li Su

Music-to-visual style transfer is a challenging yet important cross-modal learning problem in creative practice. Its major difference from the traditional image style transfer problem is that the style information is provided by music rather than by images. Assuming that musical features can be properly mapped to visual content through semantic links between the two domains, we solve the music-to-visual style transfer problem in two steps: music visualization and style transfer. The music visualization network uses an encoder-generator architecture with a conditional generative adversarial network to generate image-based music representations from music data. This network is integrated with an image style transfer method to accomplish the style transfer process. Experiments are conducted on WikiArt-IMSLP, a newly compiled dataset of Western music recordings and paintings labeled by decade. By using this decade label to learn the semantic connection between paintings and music, we demonstrate that the proposed framework can generate diverse image style representations from a music piece, and that these representations can unveil certain art forms of the same era. Subjective testing results also emphasize the role of the era label in improving the perceived compatibility between music and visual content.
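The two-step structure described in the abstract (music visualization, then style transfer) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names `visualize_music`, `transfer_style`, and `music_to_visual` are hypothetical, the "encoder-generator" is replaced by a hand-written random function rather than a trained conditional GAN, and the style transfer step is swapped for simple mean/standard-deviation matching on raw pixel values. It shows only how the pieces connect, not the authors' networks.

```python
import random
import statistics


def visualize_music(music_features, era_label, size=16, seed=0):
    """Stand-in for the trained encoder-generator (NOT the paper's network).

    Returns a flat list of grayscale "style image" pixel values.
    """
    rng = random.Random(seed)
    # "Encoder": collapse the music features into a single latent number.
    latent = statistics.fmean(music_features)
    # "Generator": the era label (the condition) shifts brightness, the latent
    # sets contrast, and the seed gives diverse outputs for the same piece.
    base = 0.2 + 0.1 * era_label
    spread = 0.05 + 0.1 * abs(latent)
    return [base + spread * rng.gauss(0.0, 1.0) for _ in range(size)]


def transfer_style(content, style):
    """Stand-in style transfer: match the content's mean/std to the style's."""
    c_mean, c_std = statistics.fmean(content), statistics.pstdev(content)
    s_mean, s_std = statistics.fmean(style), statistics.pstdev(style)
    # Guard against a constant content image (zero standard deviation).
    scale = s_std / c_std if c_std > 0 else 0.0
    return [(p - c_mean) * scale + s_mean for p in content]


def music_to_visual(music_features, era_label, content, seed=0):
    """Step 1: music -> style image. Step 2: apply that style to the content."""
    style = visualize_music(music_features, era_label, size=len(content), seed=seed)
    return transfer_style(content, style)


if __name__ == "__main__":
    content_image = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical content pixels
    features = [0.3, -0.1, 0.8]                  # hypothetical music features
    print(music_to_visual(features, era_label=2, content=content_image, seed=1))
```

Changing `seed` while keeping the same features and era label yields a different style image, which mirrors, in a very loose way, the abstract's claim that one music piece can produce diverse style representations.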

Updated: 2020-09-18