Grid Anchor Based Image Cropping: A New Benchmark and An Efficient Model, IEEE Transactions on Pattern Analysis and Machine Intelligence



IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2020-09-15, DOI: 10.1109/tpami.2020.3024207
Hui Zeng₁, Lida Li₁, Zisheng Cao₂, Lei Zhang₁


Image cropping aims to improve the composition as well as aesthetic quality of an image by removing extraneous content from it. Most of the existing image cropping databases provide only one or several human-annotated bounding boxes as the groundtruths, which can hardly reflect the non-uniqueness and flexibility of image cropping in practice. The employed evaluation metrics such as intersection-over-union cannot reliably reflect the real performance of a cropping model, either. This work revisits the problem of image cropping, and presents a grid anchor based formulation by considering the special properties and requirements (e.g., local redundancy, content preservation, aspect ratio) of image cropping. Our formulation reduces the searching space of candidate crops from millions to no more than ninety. Consequently, a grid anchor based cropping benchmark is constructed, where all crops of each image are annotated and more reliable evaluation metrics are defined. To meet the practical demands of robust performance and high efficiency, we also design an effective and lightweight cropping model. By simultaneously considering the region of interest and region of discard, and leveraging multi-scale information, our model can robustly output visually pleasing crops for images of different scenes. With less than 2.5M parameters, our model runs at a speed of 200 FPS on one single GTX 1080Ti GPU and 12 FPS on one i7-6800K CPU. The code is available at: https://github.com/HuiZeng/Grid-Anchor-based-Image-Cropping-Pytorch.
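The grid anchor idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the parameters, so everything specific here is an assumption chosen for illustration: a 12×12 grid of bins with anchors at bin centers, crop corners restricted to 4×4 zones at opposite corners of the image (local redundancy), a crop area of at least half the image (content preservation), and an aspect ratio between 0.5 and 2. The function name `grid_anchor_crops` and its arguments are hypothetical and are not taken from the authors' repository.

```python
from itertools import product


def grid_anchor_crops(width, height, bins=12, zone=4,
                      min_area_frac=0.5, min_ratio=0.5, max_ratio=2.0):
    """Enumerate candidate crops whose corners sit on a coarse grid.

    All default values are illustrative assumptions, not values
    stated in the abstract. Returns a list of (x1, y1, x2, y2) boxes.
    """
    step_x = width / bins   # bin width in pixels
    step_y = height / bins  # bin height in pixels
    crops = []
    # Top-left corner: only anchors in the top-left zone x zone bins.
    for i1, j1 in product(range(zone), repeat=2):
        # Bottom-right corner: only anchors in the bottom-right zone.
        for i2, j2 in product(range(bins - zone, bins), repeat=2):
            dx, dy = i2 - i1, j2 - j1  # crop size in grid cells
            # Content preservation: keep at least min_area_frac of the image.
            if dx * dy < min_area_frac * bins * bins:
                continue
            # Aspect ratio constraint on the crop (width / height).
            ratio = (dx * step_x) / (dy * step_y)
            if not (min_ratio <= ratio <= max_ratio):
                continue
            # Anchors are placed at bin centers, hence the 0.5 offset.
            crops.append(((i1 + 0.5) * step_x, (j1 + 0.5) * step_y,
                          (i2 + 0.5) * step_x, (j2 + 0.5) * step_y))
    return crops


if __name__ == "__main__":
    for w, h in [(1200, 1200), (1600, 900)]:
        print(w, h, len(grid_anchor_crops(w, h)))
```

Under these assumed settings a square image yields 90 candidates, and other aspect ratios yield fewer because the ratio constraint removes more boxes. A dense search over all pixel-aligned rectangles would be many orders of magnitude larger, which is the reduction the abstract refers to.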


Updated: 2020-09-15
