Exploring Simple and Transferable Recognition-Aware Image Processing
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2022-06-15, DOI: 10.1109/tpami.2022.3183243
Zhuang Liu 1, Hungju Wang 1, Tinghui Zhou 1, Zhiqiang Shen 2, Bingyi Kang 3, Evan Shelhamer 1, Trevor Darrell 1

Recent progress in image recognition has stimulated the deployment of vision systems at an unprecedented scale. As a result, visual data are now often consumed not only by humans but also by machines. Existing image processing methods only optimize for better human perception, yet the resulting images may not be accurately recognized by machines. This can be undesirable, e.g., the images can be improperly handled by search engines or recommendation systems. In this work, we examine simple approaches to improve machine recognition of processed images: optimizing the recognition loss directly on the image processing network or through an intermediate input transformation model. Interestingly, the processing model's ability to enhance recognition quality can transfer when evaluated on models of different architectures, recognized categories, tasks, and training datasets. This makes the methods applicable even when we do not have the knowledge of future recognition models, e.g., when uploading processed images to the Internet. We conduct experiments on multiple image processing tasks paired with ImageNet classification and PASCAL VOC detection as recognition tasks. With these simple yet effective methods, substantial accuracy gain can be achieved with strong transferability and minimal image quality loss. Through a user study we further show that the accuracy gain can transfer to a black-box cloud model. Finally, we try to explain this transferability phenomenon by demonstrating the similarities of different models’ decision boundaries. Code is available at https://github.com/liuzhuang13/Transferable_RA.
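
The first approach mentioned in the abstract, optimizing a recognition loss directly on the image processing network, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: mean squared error as the processing loss, softmax cross-entropy as the recognition loss, the weight `lam`, and the hypothetical `toy_recognizer` standing in for a fixed pretrained classifier. The second approach (an intermediate input transformation model) is not shown.

```python
import math

def mse(output, target):
    """Mean squared error between two equal-length pixel lists."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed in a stable way."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]

def recognition_aware_loss(processed, target, label, recognizer, lam=0.1):
    """Processing loss plus a weighted recognition loss.

    `recognizer` is treated as fixed: in training, only the processing
    model that produced `processed` would be updated to reduce this value.
    The weight `lam` is an assumed hyperparameter.
    """
    return mse(processed, target) + lam * cross_entropy(recognizer(processed), label)

# Hypothetical stand-in for a pretrained classifier: two logits from the mean pixel.
def toy_recognizer(pixels):
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]

processed = [0.2, 0.4, 0.6]  # output of some image processing model
target = [0.0, 0.5, 1.0]     # ground-truth clean image
loss = recognition_aware_loss(processed, target, label=1, recognizer=toy_recognizer)
print(f"combined loss: {loss:.4f}")
```

Setting `lam=0` recovers plain processing-loss training, which makes the trade-off between image quality and recognition accuracy explicit in a single parameter.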

Updated: 2022-06-15