MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
arXiv - CS - Machine Learning. Pub Date: 2023-07-31, DOI: arxiv-2307.16424
Baoquan Zhang, Demin Yu

Equipping a deep model with the ability of few-shot learning, i.e., learning quickly from only a few examples, is a core challenge for artificial intelligence. Gradient-based meta-learning approaches effectively address this challenge by learning how to learn novel tasks. Their key idea is to learn a deep model in a bi-level optimization manner, where the outer-loop process learns a shared gradient descent algorithm (i.e., its hyperparameters), while the inner-loop process leverages it to optimize a task-specific model using only a few labeled examples. Although these existing methods have shown superior performance, the outer-loop process requires computing second-order derivatives along the inner optimization path, which imposes a considerable memory burden and the risk of vanishing gradients. Drawing inspiration from recent progress on diffusion models, we find that the inner-loop gradient descent process can in fact be viewed as the reverse (i.e., denoising) process of diffusion, where the target of denoising is the model weights rather than the original data. Based on this observation, in this paper we propose to model the gradient descent optimizer as a diffusion model and present a novel task-conditional diffusion-based meta-learning method, called MetaDiff, that effectively models the optimization of model weights from Gaussian noise to the target weights in a denoising manner. Thanks to the training efficiency of diffusion models, our MetaDiff does not need to differentiate through the inner-loop path, so the memory burden and the risk of vanishing gradients are effectively alleviated. Experimental results show that MetaDiff outperforms the state-of-the-art gradient-based meta-learning family on few-shot learning tasks.
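To make the core idea concrete, below is a minimal, self-contained sketch of treating task-specific weights as the "data" of a conditional diffusion model. This is not the authors' released implementation; the diffusion schedule, the scalar time embedding, and all names (WeightDenoiser, diffusion_loss, the dimensions) are illustrative assumptions based only on the abstract.

```python
# Hypothetical sketch (not the MetaDiff code): a DDPM-style denoiser over
# flattened model weights, conditioned on a diffusion step and a task embedding.
import torch
import torch.nn as nn

T = 100  # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)          # standard linear noise schedule
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

class WeightDenoiser(nn.Module):
    """Predicts the noise added to flattened weights w_t, given step t and
    a task condition c (e.g., an embedding of the support set)."""
    def __init__(self, w_dim, c_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(w_dim + c_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w_t, t, c):
        t_feat = t.float().unsqueeze(-1) / T  # crude scalar time embedding
        return self.net(torch.cat([w_t, t_feat, c], dim=-1))

def diffusion_loss(model, w0, c):
    """Standard noise-prediction loss: w0 are target task weights, c is the
    task condition. Only a single random step t is touched per update."""
    b = w0.shape[0]
    t = torch.randint(0, T, (b,))
    eps = torch.randn_like(w0)
    ab = alpha_bars[t].unsqueeze(-1)
    w_t = ab.sqrt() * w0 + (1 - ab).sqrt() * eps  # forward (noising) process
    return ((model(w_t, t, c) - eps) ** 2).mean()

# Outer-loop training step with placeholder data.
w_dim, c_dim = 64, 32
model = WeightDenoiser(w_dim, c_dim)
w0 = torch.randn(8, w_dim)  # placeholder target weights, one row per task
c = torch.randn(8, c_dim)   # placeholder task embeddings
loss = diffusion_loss(model, w0, c)
loss.backward()
```

Note how this training objective evaluates the denoiser at a single randomly sampled step t, so gradients never flow through an unrolled inner-loop optimization trajectory; under the abstract's framing, this is what removes the second-order derivatives and their memory and vanishing-gradient costs.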

Updated: 2023-08-01