Approximation Algorithms for Training One-Node ReLU Neural Networks
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2020-01-01, DOI: 10.1109/tsp.2020.3039360
Santanu S. Dey , Guanyi Wang , Yao Xie

Training a one-node neural network with the ReLU activation function via optimization, which we refer to as the ON-ReLU problem, is a fundamental problem in machine learning. In this paper, we begin by proving the NP-hardness of the ON-ReLU problem. We then present an approximation algorithm for the ON-ReLU problem whose running time is $\mathcal{O}(n^k)$, where $n$ is the number of samples and $k$ is a predefined integer constant serving as an algorithm parameter. We analyze the performance of this algorithm under two regimes and show that: (1) given an arbitrary set of training samples, the algorithm guarantees an $(n/k)$-approximation for the ON-ReLU problem (to the best of our knowledge, this is the first algorithm with a guaranteed approximation ratio for arbitrary data); thus, in the ideal case (i.e., when the training error is zero), the approximation algorithm attains the globally optimal solution of the ON-ReLU problem; and (2) given training samples with Gaussian noise, the same approximation algorithm achieves a much better asymptotic approximation ratio that is independent of the number of samples $n$. Extensive numerical studies show that our approximation algorithm can outperform the gradient descent algorithm. Our numerical results also show that the solution of the approximation algorithm provides a good initialization for gradient descent, which can significantly improve its performance.
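The abstract does not write out the training objective; a common formulation of one-node ReLU training is the least-squares problem $\min_{w} \sum_{i=1}^{n} \big(\max(0, w^{\top} x_i) - y_i\big)^2$. The sketch below assumes that objective and implements a plain (sub)gradient-descent baseline of the kind the paper compares against; all function names, hyperparameters, and the synthetic data are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def on_relu_loss(w, X, y):
    """Squared training error of a one-node ReLU network max(0, Xw) against targets y."""
    return np.sum((np.maximum(X @ w, 0.0) - y) ** 2)


def on_relu_gradient_descent(X, y, lr=1e-2, n_iters=5000, seed=0):
    """Plain (sub)gradient descent on the assumed least-squares ON-ReLU objective."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(size=d)
    for _ in range(n_iters):
        pre = X @ w                       # pre-activations w^T x_i
        residual = np.maximum(pre, 0.0) - y
        active = (pre > 0).astype(float)  # subgradient of the ReLU at each sample
        grad = 2.0 * X.T @ (residual * active)
        w -= lr * grad / n
    return w


if __name__ == "__main__":
    # Synthetic data: planted weights plus Gaussian noise (the abstract's second regime).
    rng = np.random.default_rng(1)
    n, d = 200, 5
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = np.maximum(X @ w_true, 0.0) + 0.05 * rng.normal(size=n)

    w_hat = on_relu_gradient_descent(X, y)
    print("training loss:", on_relu_loss(w_hat, X, y))
```

In the paper's setting, the output of the $\mathcal{O}(n^k)$ approximation algorithm could be used in place of the random initialization above, which is the initialization strategy the numerical experiments report as improving gradient descent.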

Updated: 2020-01-01