Modeling Musical Structure with Artificial Neural Networks
arXiv - CS - Multimedia Pub Date : 2020-01-06 , DOI: arxiv-2001.01720
Stefan Lattner

In recent years, artificial neural networks (ANNs) have become a universal tool for tackling real-world problems. ANNs have also shown great success in music-related tasks, including music summarization and classification, similarity estimation, computer-aided or autonomous composition, and automatic music analysis. As structure is a fundamental characteristic of Western music, it plays a role in all these tasks. Some structural aspects are particularly challenging to learn with current ANN architectures. This is especially true for mid- and high-level self-similarity and for tonal and rhythmic relationships. In this thesis, I explore the application of ANNs to different aspects of musical structure modeling, identify some of the challenges involved, and propose strategies to address them. First, using probability estimations of a Restricted Boltzmann Machine (RBM), a probabilistic bottom-up approach to melody segmentation is studied. Then, a top-down method for imposing a high-level structural template in music generation is presented, which combines Gibbs sampling using a convolutional RBM with gradient-descent optimization on the intermediate solutions. Furthermore, I motivate the relevance of musical transformations in structure modeling and show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments. For learning transformations in sequences, I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals. The applicability of these interval representations to a top-down discovery of repeated musical sections is also shown. Finally, a recurrent variant of the GAE is proposed, and its efficacy in music prediction and in modeling low-level repetition structure is demonstrated.

Updated: 2020-01-08