End-to-end Compression Towards Machine Vision: Network Architecture Design and Optimization
arXiv - CS - Multimedia. Pub Date: 2021-07-01, DOI: arxiv-2107.00328. Authors: Shurun Wang, Zhao Wang, Shiqi Wang, Yan Ye
Research on visual signal compression has a long history, and deep learning
has recently fueled exciting progress. Although they achieve better
compression performance, existing end-to-end compression algorithms are still
designed for better signal quality under rate-distortion optimization. In this
paper, we show that the design and optimization of the network architecture
can be further improved for compression towards machine vision. We propose an
inverted bottleneck structure for end-to-end compression towards machine
vision, which specifically accounts for efficient representation of semantic
information. Moreover, we explore the capability of optimization by
incorporating analytics accuracy into the optimization process, and we further
explore optimality with generalized rate-accuracy optimization in an iterative
manner. We use object detection as a showcase for end-to-end compression
towards machine vision, and extensive experiments show that the proposed
scheme achieves significant BD-rate savings in terms of analysis performance.
Moreover, the scheme generalizes strongly to other machine vision tasks
because it enables signal-level reconstruction.
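
The abstract reports BD-rate savings measured against analysis performance but does not describe the evaluation protocol. As a rough illustration only, the sketch below implements the standard Bjøntegaard delta-rate calculation with a task-accuracy score (for example detection mAP) on the quality axis in place of PSNR. It assumes exactly four rate points per codec, which is the usual BD setup and makes the cubic fit an exact interpolation. The function names (`lagrange_eval`, `mean_over_interval`, `bd_rate`) are hypothetical and are not taken from the paper.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the unique polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_over_interval(xs, ys, lo, hi):
    """Mean of the interpolating cubic on [lo, hi].

    Simpson's rule on a single interval is exact for polynomials of
    degree three or less, so no numerical library is needed.
    """
    mid = 0.5 * (lo + hi)
    integral = (hi - lo) / 6.0 * (
        lagrange_eval(xs, ys, lo)
        + 4.0 * lagrange_eval(xs, ys, mid)
        + lagrange_eval(xs, ys, hi)
    )
    return integral / (hi - lo)


def bd_rate(anchor_rates, anchor_acc, test_rates, test_acc):
    """Return the BD-rate in percent.

    A negative value means the test codec needs fewer bits than the
    anchor to reach the same accuracy, averaged over the accuracy range
    the two curves share. Accuracy values within a curve must be distinct.
    """
    sizes = {len(anchor_rates), len(anchor_acc), len(test_rates), len(test_acc)}
    if sizes != {4}:
        raise ValueError("this sketch assumes exactly four points per curve")

    # Work in the log-rate domain, as in the standard BD calculation.
    log_anchor = [math.log(r) for r in anchor_rates]
    log_test = [math.log(r) for r in test_rates]

    # Integrate only over the accuracy interval covered by both curves.
    lo = max(min(anchor_acc), min(test_acc))
    hi = min(max(anchor_acc), max(test_acc))
    if hi <= lo:
        raise ValueError("the two curves share no accuracy range")

    avg_diff = (
        mean_over_interval(test_acc, log_test, lo, hi)
        - mean_over_interval(anchor_acc, log_anchor, lo, hi)
    )
    return (math.exp(avg_diff) - 1.0) * 100.0
```

As a sanity check on the definition, a test codec that reaches every accuracy level at exactly half the anchor's bitrate yields a BD-rate of -50%, and two identical curves yield 0%.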
Updated: 2021-07-02