Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2020-01-01, DOI: 10.1109/tsp.2020.2993153
Haoyu Fu , Yuejie Chi , Yingbin Liang

We study model recovery for data classification, where the training labels are generated by a one-hidden-layer neural network with sigmoid activations, also known as a single-layer feedforward network, and the goal is to recover the weights of that network. We consider two network models: the fully-connected network (FCN) and the non-overlapping convolutional neural network (CNN). We prove that, with Gaussian inputs, the empirical risk based on cross entropy exhibits strong convexity and smoothness uniformly in a local neighborhood of the ground truth, as soon as the sample size is sufficiently large. This implies that if gradient descent is initialized in this neighborhood, it converges linearly to a critical point that is provably close to the ground truth. Furthermore, we show that such an initialization can be obtained via the tensor method. This establishes a global convergence guarantee for empirical risk minimization with the cross-entropy loss via gradient descent for learning one-hidden-layer neural networks, at near-optimal sample and computational complexity with respect to the network input dimension, without unrealistic assumptions such as requiring a fresh set of samples at each iteration.
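The recovery pipeline described above (Gaussian inputs, Bernoulli labels drawn from a one-hidden-layer sigmoid FCN, gradient descent on the cross-entropy empirical risk from an initialization near the ground truth) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions `d`, `K`, `n`, the step size, and the additive-noise initialization (a stand-in for the tensor-method initialization) are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes for illustration (not the paper's regime):
# d input dimensions, K hidden units, n samples.
d, K, n = 10, 3, 20_000

W_star = rng.normal(size=(K, d))   # ground-truth FCN weights
X = rng.normal(size=(n, d))        # Gaussian inputs

def predict(W):
    """Network output f(x; W) = (1/K) * sum_k sigmoid(w_k . x)."""
    return sigmoid(X @ W.T).mean(axis=1)

# Binary training labels drawn from the network's output probabilities.
y = rng.binomial(1, predict(W_star)).astype(float)

def cross_entropy(W):
    """Empirical risk: mean cross entropy between labels and f(x; W)."""
    p = np.clip(predict(W), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(W):
    """Gradient of the empirical cross-entropy risk w.r.t. W."""
    p = np.clip(predict(W), 1e-12, 1 - 1e-12)
    s = sigmoid(X @ W.T)               # (n, K) hidden-unit activations
    coef = (p - y) / (p * (1 - p))     # dCE/df for each sample
    return (coef[:, None] * s * (1 - s)).T @ X / (n * K)

# Initialize inside a neighborhood of W_star. In the paper this
# neighborhood is reached via the tensor method; here we simply
# perturb the ground truth, which is an assumption of this sketch.
W0 = W_star + 0.3 * rng.normal(size=(K, d))

# Plain gradient descent; in the local strong-convexity regime the
# iterates contract linearly toward the empirical minimizer.
W, lr = W0.copy(), 0.5
for _ in range(500):
    W -= lr * gradient(W)
```

After the loop, the recovery error `np.linalg.norm(W - W_star)` should be much smaller than at initialization, consistent with the linear-convergence claim; the residual error reflects the Bernoulli label noise at finite sample size.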

Updated: 2020-01-01