Tangent-space gradient optimization of tensor network for machine learning.
Physical Review E (IF 2.4), Pub Date: 2020-07-30, DOI: 10.1103/physreve.102.012152
Zheng-Zhi Sun, Shi-Ju Ran, Gang Su

The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state |ψ⟩ in Hilbert space. We show that the gradient can be restricted to the tangent space of the ⟨ψ|ψ⟩ = 1 hypersphere. Instead of the additional adaptive methods used to control the learning rate η in deep learning, the learning rate of TSGO is naturally determined by the rotation angle θ as η = tan θ. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam.
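The update rule described in the abstract can be illustrated in a few lines of NumPy. The following is a minimal sketch based only on the properties stated above (parameter vector on the unit hypersphere, gradient projected onto its tangent space, rotation by angle θ with effective learning rate η = tan θ); the function name tsgo_step, the descent sign convention, and the toy gradient are our own assumptions, not taken from the paper.

```python
import numpy as np

def tsgo_step(psi, grad, theta=0.01):
    """One TSGO-style update.

    psi   : parameter vector, assumed normalized (||psi|| = 1)
    grad  : gradient of the loss with respect to psi
    theta : rotation angle; the effective learning rate is eta = tan(theta)
    """
    # Project the gradient onto the tangent space of the unit hypersphere,
    # enforcing orthogonality between the parameters and the gradient.
    g_tan = grad - np.dot(grad, psi) * psi
    g_tan = g_tan / np.linalg.norm(g_tan)
    # Rotate psi away from the (projected) gradient direction by angle theta;
    # cos^2 + sin^2 = 1, so the result stays on the <psi|psi> = 1 hypersphere.
    return np.cos(theta) * psi - np.sin(theta) * g_tan

# Toy usage: the norm is preserved up to floating-point error.
rng = np.random.default_rng(0)
psi = rng.normal(size=8)
psi = psi / np.linalg.norm(psi)   # start on the unit hypersphere
grad = rng.normal(size=8)         # placeholder gradient
psi = tsgo_step(psi, grad, theta=0.05)
print(np.linalg.norm(psi))        # ~1.0
```

Because the parameter vector never leaves the unit hypersphere, the update cannot blow up or collapse the parameters, which is how this scheme sidesteps exploding and vanishing gradients without an adaptive learning-rate controller.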
