Learning a Lie Algebra from Unlabeled Data Pairs
arXiv - CS - Sound, Pub Date: 2020-09-19, DOI: arxiv-2009.09321
Christopher Ick and Vincent Lostanlen

Deep convolutional networks (convnets) show a remarkable ability to learn disentangled representations. In recent years, the generalization of deep learning to Lie groups beyond rigid motion in $\mathbb{R}^n$ has made it possible to build convnets over datasets with non-trivial symmetries, such as patterns over the surface of a sphere. However, one limitation of this approach is the need to explicitly define the Lie group underlying the desired invariance property before training the convnet. Whereas rotations on the sphere have a well-known symmetry group ($\mathrm{SO}(3)$), the same cannot be said of many real-world factors of variability. For example, the disentanglement of pitch, intensity dynamics, and playing technique remains a challenging task in music information retrieval. This article proposes a machine learning method to discover a nonlinear transformation of the space $\mathbb{R}^n$ which maps a collection of $n$-dimensional vectors $(\boldsymbol{x}_i)_i$ onto a collection of target vectors $(\boldsymbol{y}_i)_i$. The key idea is to approximate every target $\boldsymbol{y}_i$ by a matrix--vector product of the form $\boldsymbol{\widetilde{y}}_i = \boldsymbol{\phi}(t_i) \boldsymbol{x}_i$, where the matrix $\boldsymbol{\phi}(t_i)$ belongs to a one-parameter subgroup of $\mathrm{GL}_n (\mathbb{R})$. Crucially, the value of the parameter $t_i \in \mathbb{R}$ may change between data pairs $(\boldsymbol{x}_i, \boldsymbol{y}_i)$ and does not need to be known in advance.
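
To make the matrix--vector form above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that the one-parameter subgroup is written as $\boldsymbol{\phi}(t) = \exp(t \boldsymbol{A})$ for a fixed generator matrix $\boldsymbol{A}$, which is the standard form of a one-parameter subgroup of $\mathrm{GL}_n (\mathbb{R})$. The generator `A` (a planar rotation), the helper names `expm`, `phi`, and `estimate_t`, and the grid search over $t$ are all illustrative assumptions. In particular, the sketch only recovers an unknown $t_i$ for one pair when the generator is already given; it does not learn the generator from data.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def expm(a, terms=30):
    """Truncated Taylor series exp(A) = sum_k A^k / k!.

    Adequate for the small, well-scaled matrices used here; a numerical
    library would be preferable in practice.
    """
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def phi(t, generator):
    """One-parameter subgroup element phi(t) = exp(t * generator)."""
    return expm([[t * v for v in row] for row in generator])


def estimate_t(x, y, generator, candidates):
    """Pick the candidate t minimizing ||phi(t) x - y||^2 (hypothetical helper)."""
    def squared_error(t):
        y_hat = matvec(phi(t, generator), x)
        return sum((u - v) ** 2 for u, v in zip(y_hat, y))
    return min(candidates, key=squared_error)


# Assumed generator: infinitesimal rotation of the plane, so phi(t) is a
# rotation by angle t. This choice is for illustration only.
A = [[0.0, -1.0],
     [1.0, 0.0]]

x = [1.0, 0.0]
t_true = 0.7                       # unknown to the estimator
y = matvec(phi(t_true, A), x)      # target vector for this pair

grid = [k / 100 for k in range(-300, 301)]  # candidate values of t
t_hat = estimate_t(x, y, A, grid)
print(t_hat)
```

Because each pair gets its own estimate of $t$, the same generator can account for pairs that differ by different amounts of the transformation. The grid is kept shorter than one full rotation period so that the minimizer is unique in this toy setting.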

Updated: 2020-11-13