Diffusion Models Beat GANs on Image Synthesis
arXiv - CS - Machine Learning Pub Date : 2021-05-11 , DOI: arxiv-2105.05233
Prafulla Dhariwal, Alex Nichol

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for sample quality using gradients from a classifier. We achieve an FID of 2.97 on ImageNet $128 \times 128$, 4.59 on ImageNet $256 \times 256$, and $7.72$ on ImageNet $512 \times 512$, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.85 on ImageNet $512 \times 512$. We release our code at https://github.com/openai/guided-diffusion
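The abstract's key mechanism, classifier guidance, shifts each sampling step's predicted mean by the (scaled) gradient of a classifier's log-probability with respect to the noisy image. The sketch below illustrates that update rule only; the model and classifier functions are toy stand-ins (assumptions for illustration), not the learned U-Net and noisy-image classifier from the released guided-diffusion code.

```python
import math
import random

def model_mean_and_variance(x, t):
    """Toy stand-in for the diffusion model's predicted mean and variance
    at step t (in the paper these come from a learned U-Net)."""
    mean = [0.9 * xi for xi in x]  # pretend partially-denoised mean
    variance = 0.01                # pretend per-step variance
    return mean, variance

def classifier_log_prob_grad(x, t, y):
    """Toy stand-in for grad_x log p(y | x_t) from a classifier trained
    on noisy images. Here: gradient of -||x - target||^2 / 2."""
    target = [1.0 if i == y else -1.0 for i in range(len(x))]
    return [ti - xi for xi, ti in zip(x, target)]

def guided_step(x, t, y, guidance_scale=1.0):
    """One ancestral sampling step with classifier guidance:
    new mean = model mean + scale * variance * grad_x log p(y | x_t).
    Larger guidance_scale trades diversity for sample quality."""
    mean, variance = model_mean_and_variance(x, t)
    grad = classifier_log_prob_grad(x, t, y)
    guided_mean = [m + guidance_scale * variance * g
                   for m, g in zip(mean, grad)]
    noise_scale = math.sqrt(variance) if t > 0 else 0.0  # no noise at t=0
    return [m + noise_scale * random.gauss(0.0, 1.0) for m in guided_mean]

random.seed(0)
x = [0.5, -0.2, 0.1]            # toy 3-dimensional "image"
x_next = guided_step(x, t=10, y=0, guidance_scale=2.0)
print(x_next)
```

In the full algorithm this step is repeated from pure noise down to t=0; the guidance scale is the single knob the paper uses to trade sample diversity for fidelity.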

Updated: 2021-05-12