Conditional LSTM-GAN for Melody Generation from Lyrics, ACM Transactions on Multimedia Computing, Communications, and Applications



ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.1), Pub Date: 2021-04-16, DOI: 10.1145/3424116
Yi Yu, Abhishek Srivastava, Simon Canales


Melody generation from lyrics is a challenging research problem in artificial intelligence and music, one that enables us to learn and discover latent relationships between lyrics and their accompanying melodies. Unfortunately, the limited availability of paired lyrics–melody datasets with alignment information has hindered research progress. To address this problem, we create a large dataset of 12,197 MIDI songs, each with paired lyrics and melody alignment, by leveraging different music sources from which the alignment relationship between syllables and music attributes is extracted. Most importantly, we propose a novel deep generative model, a conditional Long Short-Term Memory (LSTM)–Generative Adversarial Network for melody generation from lyrics, which contains a deep LSTM generator and a deep LSTM discriminator, both conditioned on lyrics. In particular, the lyrics-conditioned melody and the alignment relationship between the syllables of the given lyrics and the notes of the predicted melody are generated simultaneously. Extensive experimental results demonstrate the effectiveness of the proposed lyrics-to-melody generative model, with which plausible and tuneful sequences can be inferred from lyrics.
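To make "both conditioned on lyrics" concrete, the following is a minimal sketch of how per-syllable inputs to the generator and discriminator might be assembled. It is an illustration only, not the authors' implementation: the `Note` attributes (pitch, duration, rest), the one-note-per-syllable alignment, the function names, and the use of plain concatenation for conditioning are all assumptions not stated in the abstract, and the LSTM layers themselves are omitted.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    """One melody event aligned to one syllable (attribute names are assumptions)."""
    pitch: int        # MIDI note number
    duration: float   # note length, in beats
    rest: float       # rest preceding the note, in beats


def generator_inputs(syllable_vecs: List[List[float]], noise_dim: int,
                     rng: random.Random) -> List[List[float]]:
    """For each syllable, concatenate a fresh Gaussian noise vector with the
    syllable embedding; a generator LSTM would consume this sequence."""
    return [
        [rng.gauss(0.0, 1.0) for _ in range(noise_dim)] + list(vec)
        for vec in syllable_vecs
    ]


def discriminator_inputs(notes: List[Note],
                         syllable_vecs: List[List[float]]) -> List[List[float]]:
    """For each step, concatenate the note attributes with the same syllable
    embedding; a discriminator LSTM would score this sequence as real or fake."""
    if len(notes) != len(syllable_vecs):
        raise ValueError("expected exactly one note per syllable")
    return [
        [float(n.pitch), n.duration, n.rest] + list(vec)
        for n, vec in zip(notes, syllable_vecs)
    ]
```

Under these assumptions, a lyric of three syllables with 2-dimensional embeddings and a 4-dimensional noise vector yields three generator steps of length 6 and three discriminator steps of length 5, so both networks see the lyrics at every time step.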


Updated: 2021-04-16
