Optimizing JPEG Quantization for Classification Networks,arXiv - CS - Graphics


arXiv - CS - Graphics. Pub Date: 2020-03-05, DOI: arxiv-2003.02874
Zhijing Li, Christopher De Sa, Adrian Sampson

Deep learning for computer vision depends on lossy image compression: it reduces the storage required for training and test data and lowers transfer costs in deployment. Mainstream datasets and imaging pipelines all rely on standard JPEG compression. In JPEG, the degree of quantization of frequency coefficients controls the lossiness: an 8 by 8 quantization table (Q-table) decides both the quality of the encoded image and the compression ratio. While a long history of work has sought better Q-tables, existing work seeks either to minimize image distortion or to optimize for models of the human visual system. This work asks whether JPEG Q-tables exist that are "better" for specific vision networks and can offer better quality-size trade-offs than ones designed for human perception or minimal distortion. We reconstruct an ImageNet test set with higher resolution to explore the effect of JPEG compression under novel Q-tables. We attempt several approaches to tune a Q-table for a vision task. We find that a simple sorted random sampling method can exceed the performance of the standard JPEG Q-table. We also use hyper-parameter tuning techniques including bounded random search, Bayesian optimization, and composite heuristic optimization methods. The new Q-tables we obtained can improve the compression rate by 10% to 200% when the accuracy is fixed, or improve accuracy by up to 2% at the same compression rate.
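To make the role of the Q-table concrete, here is a minimal Python sketch of quantizing one 8 by 8 block of frequency coefficients, together with one possible reading of "sorted random sampling". The abstract does not spell out the sampling procedure, so the details are assumptions: 64 integers are drawn uniformly from 1 to 255 (the range of a baseline 8-bit JPEG table), sorted in ascending order, and laid out along the JPEG zigzag scan so that low frequencies get the smallest steps. The function names `zigzag_order`, `sorted_random_qtable`, `quantize`, and `dequantize` are hypothetical and are not taken from the paper.

```python
import random


def zigzag_order(n=8):
    """Return the (row, col) cells of an n x n block in JPEG zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells are grouped by anti-diagonal (r + c). Odd diagonals are walked
    # with the row increasing, even diagonals with the column increasing.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def sorted_random_qtable(rng, low=1, high=255, n=8):
    """Assumed procedure: sample n*n steps, sort them, place them in zigzag order."""
    values = sorted(rng.randint(low, high) for _ in range(n * n))
    table = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(values, zigzag_order(n)):
        table[r][c] = value
    return table


def quantize(block, qtable):
    """Divide each coefficient by its step and round to the nearest integer level."""
    return [
        [round(coef / step) for coef, step in zip(brow, qrow)]
        for brow, qrow in zip(block, qtable)
    ]


def dequantize(levels, qtable):
    """Multiply each level by its step to get the decoder-side coefficient."""
    return [
        [level * step for level, step in zip(lrow, qrow)]
        for lrow, qrow in zip(levels, qtable)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    qtable = sorted_random_qtable(rng)
    block = [[100 - 3 * (r + c) for c in range(8)] for r in range(8)]  # toy coefficients
    levels = quantize(block, qtable)
    restored = dequantize(levels, qtable)
    for row in qtable:
        print(row)
    print("max reconstruction error:",
          max(abs(a - b) for ra, rb in zip(block, restored) for a, b in zip(ra, rb)))
```

Larger steps map more coefficients to zero, which shrinks the encoded file but discards more detail; each coefficient is reconstructed to within half of its step. A search method such as those listed in the abstract would generate many candidate tables like this and score each one by file size and classifier accuracy, which this sketch does not attempt.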


Updated: 2020-03-09
