MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
arXiv - CS - Machine Learning. Pub Date: 2023-07-31. DOI: arxiv-2307.16424. Baoquan Zhang, Demin Yu
Equipping a deep model with the ability of few-shot learning, i.e., learning quickly from only a few examples, is a core challenge for artificial intelligence. Gradient-based meta-learning approaches effectively address this challenge by learning how to learn novel tasks. Their key idea is to learn a deep model in a bi-level optimization manner, where the outer-loop process learns a shared gradient descent algorithm (i.e., its hyperparameters), while the inner-loop process leverages it to optimize a task-specific model using only a few labeled examples. Although these existing methods have shown superior performance, the outer-loop process requires calculating second-order derivatives along the inner optimization path, which imposes a considerable memory burden and a risk of vanishing gradients. Drawing inspiration from recent progress in diffusion models, we find that the inner-loop gradient descent process can actually be viewed as a reverse (i.e., denoising) process of diffusion, where the target of denoising is the model weights rather than the original data. Based on this observation, we propose to model the gradient descent optimizer as a diffusion model and present a novel task-conditional diffusion-based meta-learning method, called MetaDiff, that effectively models the optimization of model weights from Gaussian noise to target weights in a denoising manner. Thanks to the training efficiency of diffusion models, MetaDiff does not need to differentiate through the inner-loop path, so the memory burden and the risk of vanishing gradients are effectively alleviated. Experimental results show that MetaDiff outperforms state-of-the-art gradient-based meta-learning methods on few-shot learning tasks.
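
The abstract describes no implementation, but the core idea, replacing the inner-loop gradient descent with a conditional diffusion reverse process over model weights, can be sketched in a few lines of PyTorch. Everything below is a hypothetical illustration under assumed names and shapes (`TaskConditionedDenoiser`, `sample_task_weights`, a toy linear noise schedule, a 128-dimensional task embedding); it shows a standard DDPM-style denoising loop applied to a flattened weight vector, not the authors' actual architecture or schedule.

```python
import torch
import torch.nn as nn


class TaskConditionedDenoiser(nn.Module):
    """Hypothetical denoiser: predicts the noise in a flattened weight
    vector, conditioned on a task embedding and the diffusion timestep."""

    def __init__(self, weight_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(weight_dim + cond_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, weight_dim),
        )

    def forward(self, w_t, t, task_cond):
        # t is a (1, 1) timestep in [0, 1]; broadcast it over the batch.
        t_feat = t.expand(w_t.shape[0], 1)
        return self.net(torch.cat([w_t, task_cond, t_feat], dim=-1))


@torch.no_grad()
def sample_task_weights(denoiser, task_cond, weight_dim, n_steps=50):
    """Reverse (denoising) process: start from Gaussian noise and iteratively
    denoise toward task-specific weights. This plays the role of the
    inner-loop optimizer, and no gradients flow through the loop."""
    w = torch.randn(task_cond.shape[0], weight_dim)   # w_T ~ N(0, I)
    betas = torch.linspace(1e-4, 0.02, n_steps)       # toy noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    for step in reversed(range(n_steps)):
        t = torch.tensor([[step / n_steps]])
        eps_hat = denoiser(w, t, task_cond)           # predicted noise
        # Standard DDPM mean update; extra noise is added except at step 0.
        coef = betas[step] / torch.sqrt(1.0 - alpha_bars[step])
        w = (w - coef * eps_hat) / torch.sqrt(alphas[step])
        if step > 0:
            w = w + torch.sqrt(betas[step]) * torch.randn_like(w)
    return w  # task-specific weights, obtained without inner-loop backprop


# Hypothetical usage: weights of a 5-way linear classifier head on top of a
# 640-dimensional feature extractor, conditioned on a support-set embedding.
denoiser = TaskConditionedDenoiser(weight_dim=640 * 5, cond_dim=128)
support_embedding = torch.randn(1, 128)               # stand-in task encoder output
w_task = sample_task_weights(denoiser, support_embedding, weight_dim=640 * 5)
```

Because the sampling loop runs under `torch.no_grad()`, nothing has to backpropagate through the inner loop: the denoiser would be trained with an ordinary diffusion noise-prediction loss on weight targets, which is the property the abstract credits for avoiding second-order derivatives and their memory and vanishing-gradient costs.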
Updated: 2023-08-01