End-to-End Learnt Image Compression via Non-Local Attention Optimization and Improved Context Modeling
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2021-02-19, DOI: 10.1109/tip.2021.3058615
Tong Chen, Haojie Liu, Zhan Ma, Qiu Shen, Xun Cao, Yao Wang

This article proposes an end-to-end learnt lossy image compression approach, built on top of the deep neural network (DNN)-based variational auto-encoder (VAE) structure with Non-Local Attention optimization and Improved Context modeling (NLAIC). Our NLAIC 1) embeds non-local network operations as non-linear transforms in both the main and hyper coders to derive the respective latent features and hyperpriors by exploiting both local and global correlations, 2) applies an attention mechanism to generate implicit masks that weight the features for adaptive bit allocation, and 3) implements improved conditional entropy modeling of the latent features using joint 3D convolutional neural network (CNN)-based autoregressive contexts and hyperpriors. Towards practical application, additional enhancements are also introduced to speed up computation (e.g., parallel 3D CNN-based context prediction), decrease memory consumption (e.g., sparse non-local processing) and reduce implementation complexity (e.g., a unified model for variable rates without re-training). The proposed model outperforms existing learnt and conventional (e.g., BPG, JPEG2000, JPEG) image compression methods on both the Kodak and Tecnick datasets, achieving state-of-the-art compression efficiency under both PSNR and MS-SSIM quality measurements. We have made all materials publicly accessible at https://njuvision.github.io/NIC for reproducible research.
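To make the non-local attention idea concrete, the following is a minimal PyTorch sketch of a non-local attention module of the kind the abstract describes: a non-local (global self-attention) branch produces an implicit sigmoid mask that re-weights the local features for adaptive bit allocation. The class and variable names (NonLocalBlock, NonLocalAttentionModule, mask_branch) are illustrative assumptions, not the authors' released implementation, which is available at https://njuvision.github.io/NIC.

import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    # Embedded-Gaussian non-local operation: global spatial self-attention
    # with a residual connection (illustrative sketch, not the released code).
    def __init__(self, channels: int):
        super().__init__()
        inter = max(channels // 2, 1)
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.g = nn.Conv2d(channels, inter, 1)
        self.out = nn.Conv2d(inter, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # B x HW x C'
        k = self.phi(x).flatten(2)                     # B x C' x HW
        v = self.g(x).flatten(2).transpose(1, 2)       # B x HW x C'
        attn = torch.softmax(q @ k, dim=-1)            # affinities over all spatial positions
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection

class NonLocalAttentionModule(nn.Module):
    # Non-local branch followed by a sigmoid mask that re-weights the features.
    def __init__(self, channels: int):
        super().__init__()
        self.mask_branch = nn.Sequential(
            NonLocalBlock(channels),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Sigmoid(),                              # implicit attention mask in [0, 1]
        )

    def forward(self, x):
        return x + x * self.mask_branch(x)             # adaptive feature weighting

# Example: re-weight a 192-channel latent feature map (channel count assumed).
nlam = NonLocalAttentionModule(192)
y = nlam(torch.randn(1, 192, 16, 16))                  # output keeps the input shape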

Updated: 2021-02-26