On Buggy Resizing Libraries and Surprising Subtleties in FID Calculation, arXiv - CS - Graphics


arXiv - CS - Graphics, Pub Date: 2021-04-22, DOI: arxiv-2104.11222
Gaurav Parmar, Richard Zhang, Jun-Yan Zhu

We investigate the sensitivity of the Fréchet Inception Distance (FID) score to inconsistent and often incorrect implementations across different image processing libraries. The FID score is widely used to evaluate generative models, but each FID implementation uses a different low-level image processing pipeline. Image resizing functions in commonly used deep learning libraries often introduce aliasing artifacts. We observe that numerous subtle choices need to be made for FID calculation, and that a lack of consistency in these choices can lead to vastly different FID scores. In particular, we show that the following choices are significant: (1) which image resizing library to use, (2) which interpolation kernel to use, and (3) which encoding to use when representing images. We additionally outline numerous common pitfalls that should be avoided and provide recommendations for computing the FID score accurately. We provide an easy-to-use, optimized implementation of our proposed recommendations in the accompanying code.
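
The aliasing issue mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' code and does not reproduce any particular library's resize function; the helper names (`make_stripes`, `downsample_naive`, `downsample_box`, `mean`) are hypothetical and exist only for this illustration. It assumes a tiny grayscale image stored as nested Python lists and compares downsampling without any low-pass filtering against downsampling with a simple box filter.

```python
def make_stripes(width, height):
    """Build a grayscale image with alternating black (0) and white (255) columns."""
    return [[255 if x % 2 else 0 for x in range(width)] for _ in range(height)]


def downsample_naive(img, factor):
    """Keep every `factor`-th pixel with no filtering before sampling."""
    return [row[::factor] for row in img[::factor]]


def downsample_box(img, factor):
    """Average each factor x factor block (a box filter), then keep one value per block."""
    out = []
    for y in range(0, len(img) - factor + 1, factor):
        out_row = []
        for x in range(0, len(img[0]) - factor + 1, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def mean(img):
    """Mean intensity over all pixels."""
    values = [v for row in img for v in row]
    return sum(values) / len(values)


image = make_stripes(8, 8)
naive = downsample_naive(image, 2)   # samples only the black columns
box = downsample_box(image, 2)       # every block averages black and white

print("original mean:", mean(image))
print("naive mean:   ", mean(naive))
print("box mean:     ", mean(box))
```

In this toy case the unfiltered version turns a half-white image into a completely black one, while the box-filtered version preserves the average intensity. Real resizing functions differ in kernel and implementation details, but the same kind of discrepancy, fed through a feature extractor, is one plausible way in which the choice of resizing library can shift a statistic such as FID.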


Updated: 2021-04-23
