Approximation Algorithms for Training One-Node ReLU Neural Networks
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2020-01-01, DOI: 10.1109/tsp.2020.3039360
Santanu S. Dey , Guanyi Wang , Yao Xie

Training a one-node neural network with the ReLU activation function via optimization, which we refer to as the ON-ReLU problem, is a fundamental problem in machine learning. In this paper, we begin by proving the NP-hardness of the ON-ReLU problem. We then present an approximation algorithm for the ON-ReLU problem whose running time is $\mathcal{O}(n^k)$, where $n$ is the number of samples and $k$ is a predefined integer constant serving as an algorithm parameter. We analyze the performance of this algorithm under two regimes and show that: (1) given an arbitrary set of training samples, the algorithm guarantees an $(n/k)$-approximation for the ON-ReLU problem (to the best of our knowledge, this is the first algorithm with a guaranteed approximation ratio for arbitrary data); thus, in the ideal case (i.e., when the training error is zero), the approximation algorithm attains the globally optimal solution of the ON-ReLU problem; and (2) given training samples with Gaussian noise, the same approximation algorithm achieves a much better asymptotic approximation ratio that is independent of the number of samples $n$. Extensive numerical studies show that our approximation algorithm can outperform the gradient descent algorithm. Our numerical results also show that the solution of the approximation algorithm provides a good initialization for gradient descent, which can significantly improve its performance.
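The abstract does not write out the training objective; a common formulation of one-node ReLU training is the least-squares problem $\min_{w} \sum_{i=1}^{n} \big(\max(0, w^{\top} x_i) - y_i\big)^2$. The sketch below assumes that objective and implements a plain (sub)gradient-descent baseline of the kind the paper compares against; all function names, hyperparameters, and the synthetic data are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def on_relu_loss(w, X, y):
    """Squared training error of a one-node ReLU network max(0, Xw) against targets y."""
    return np.sum((np.maximum(X @ w, 0.0) - y) ** 2)


def on_relu_gradient_descent(X, y, lr=1e-2, n_iters=5000, seed=0):
    """Plain (sub)gradient descent on the assumed least-squares ON-ReLU objective."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(size=d)
    for _ in range(n_iters):
        pre = X @ w                       # pre-activations w^T x_i
        residual = np.maximum(pre, 0.0) - y
        active = (pre > 0).astype(float)  # subgradient of the ReLU at each sample
        grad = 2.0 * X.T @ (residual * active)
        w -= lr * grad / n
    return w


if __name__ == "__main__":
    # Synthetic data: planted weights plus Gaussian noise (the abstract's second regime).
    rng = np.random.default_rng(1)
    n, d = 200, 5
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = np.maximum(X @ w_true, 0.0) + 0.05 * rng.normal(size=n)

    w_hat = on_relu_gradient_descent(X, y)
    print("training loss:", on_relu_loss(w_hat, X, y))
```

In the paper's setting, the output of the $\mathcal{O}(n^k)$ approximation algorithm could be used in place of the random initialization above, which is the initialization strategy the numerical experiments report as improving gradient descent.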

Updated: 2020-01-01