Scaling Neural Tangent Kernels via Sketching and Random Features
arXiv - CS - Data Structures and Algorithms. Pub Date: 2021-06-15, DOI: arxiv-2106.07880
Amir Zandieh, Insu Han, Haim Avron, Neta Shoham, Chaewon Kim, Jinwoo Shin

The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely wide neural networks trained under least-squares loss by gradient descent. Recent works also report that NTK regression can outperform finite-width neural networks trained on small-scale datasets. However, the computational complexity of kernel methods has limited their use in large-scale learning tasks. To accelerate learning with the NTK, we design a near input-sparsity time approximation algorithm for the NTK by sketching the polynomial expansions of arc-cosine kernels: our sketch for the convolutional counterpart of the NTK (CNTK) can transform any image in time linear in the number of pixels. Furthermore, we prove a spectral approximation guarantee for the NTK matrix by combining random features (based on leverage score sampling) of the arc-cosine kernels with a sketching algorithm. We benchmark our methods on various large-scale regression and classification tasks and show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on the CIFAR-10 dataset while achieving a 150x speedup.
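Since the NTK is built from compositions of arc-cosine kernels, the random-features idea referenced above is easiest to see on a single first-order arc-cosine kernel, for which ReLU features of Gaussian projections give an unbiased approximation. The sketch below is a minimal illustration in NumPy; the function names and the plain Gaussian sampling are our own assumptions for exposition, not the paper's leverage-score construction or its polynomial sketching algorithm.

```python
import numpy as np

def arccos_kernel_order1(X, Y):
    """Exact first-order arc-cosine kernel (Cho & Saul, 2009):
    k(x, y) = (1/pi) * ||x|| * ||y|| * (sin t + (pi - t) * cos t),
    where t is the angle between x and y."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)            # (n, 1)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)            # (m, 1)
    cos_t = np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0)      # angles between rows
    t = np.arccos(cos_t)
    return (nx * ny.T) / np.pi * (np.sin(t) + (np.pi - t) * cos_t)

def relu_random_features(X, num_features, rng):
    """Random features for the kernel above: with W having i.i.d.
    N(0, 1) entries, E[phi(x) . phi(y)] = k(x, y)."""
    d = X.shape[1]
    W = rng.standard_normal((d, num_features))
    return np.sqrt(2.0 / num_features) * np.maximum(X @ W, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))

K_exact = arccos_kernel_order1(X, X)
Phi = relu_random_features(X, num_features=8192, rng=rng)
K_approx = Phi @ Phi.T

rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {rel_err:.3f}")  # shrinks as num_features grows
```

A linear or ridge regressor trained on Phi then approximates kernel regression with this kernel at cost linear in the number of features rather than quadratic in the number of examples, which is the mechanism by which explicit (C)NTK features can replace the exact kernel.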

Last updated: 2021-06-16