Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation,Multidimensional Systems and Signal Processing


Multidimensional Systems and Signal Processing (IF 1.7), Pub Date: 2019-07-25, DOI: 10.1007/s11045-019-00671-6
Maziar Irannejad, Homayoun Mahdavi-Nasab

In the current decade, scalability has been built into video coding (VC) schemes to respond to end-user demands and the heterogeneity of networks. In this paper, a low bit-rate signal-to-noise ratio (SNR) scalable VC scheme based on dictionary learning (DL) and sparse representation is proposed. A notable feature of SNR scalability, compared to spatial and temporal scalability, is that there is no limit on the number of enhancement layers, which makes it better suited to adapting to different conditions. In this research, unlike traditional VC, in which the discrete cosine transform (DCT) coefficients of video signals are quantized to obtain different SNR qualities, sparse codes are used. Sparse coding is done over trained overcomplete dictionaries, for which three DL algorithms, namely MOD, K-SVD, and RLS-DLA, are used and compared. The dictionaries are trained in the DCT domain of general natural images to achieve higher compression and prevent blocking artifacts. The results of the proposed method are compared with non-scalable coding based on DL, and with scalable and non-scalable coding schemes based on the complete DCT dictionary employed in traditional VC standards such as MPEG.X and H.26X. The results show that, although scalability naturally reduces quality compared to non-scalable coding, the proposed scheme offers better subjective and rate–distortion performance than both non-scalable and scalable VC based on traditional DCT quantization. Moreover, among the three DL methods applied, RLS-DLA achieves the best results for both non-scalable and scalable VC.
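To make the idea of SNR-scalable sparse coding concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed overcomplete DCT dictionary in place of a dictionary trained with MOD, K-SVD, or RLS-DLA, uses a plain orthogonal matching pursuit (OMP) as the sparse coder, and omits quantization, entropy coding, and motion compensation entirely. All function names and parameters (`overcomplete_dct`, `omp`, `layered_encode`, `layered_decode`, `atoms_per_layer`) are hypothetical and chosen for illustration only. The base layer codes the signal with a few atoms, and each enhancement layer codes the residual left by the layers before it.

```python
import numpy as np

def overcomplete_dct(n=8, k=16):
    """Return an n x k overcomplete 1-D DCT dictionary with unit-norm columns.

    Assumption: stands in for a trained dictionary (MOD / K-SVD / RLS-DLA).
    """
    t = np.arange(n)[:, None]
    f = np.arange(k)[None, :]
    D = np.cos(np.pi * (t + 0.5) * f / k)
    return D / np.linalg.norm(D, axis=0)

def omp(D, x, n_atoms):
    """Greedy orthogonal matching pursuit using at most n_atoms columns of D."""
    residual, support = x.copy(), []
    sol = np.zeros(0)
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

def layered_encode(D, x, atoms_per_layer):
    """Base layer codes x; each enhancement layer codes the remaining residual."""
    layers, residual = [], x.copy()
    for n_atoms in atoms_per_layer:
        code = omp(D, residual, n_atoms)
        layers.append(code)
        residual = residual - D @ code
    return layers

def layered_decode(D, layers, n_layers):
    """Reconstruct from the first n_layers layers (n_layers >= 1)."""
    return D @ np.sum(layers[:n_layers], axis=0)

# Usage: one 8-sample signal, a base layer plus two enhancement layers.
rng = np.random.default_rng(0)
D = overcomplete_dct()
x = rng.standard_normal(8)
layers = layered_encode(D, x, atoms_per_layer=[2, 2, 2])
errors = [np.linalg.norm(x - layered_decode(D, layers, m)) for m in (1, 2, 3)]
print(errors)
```

Decoding more layers can only lower (or keep) the reconstruction error, which is the SNR-scalability property in miniature: a receiver may stop after any layer and still obtain a valid, coarser reconstruction. Because each layer simply codes whatever residual remains, nothing in this structure caps the number of enhancement layers.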


Updated: 2019-07-25
