Multi-stage all-zero block detection for HEVC coding using machine learning

https://doi.org/10.1016/j.jvcir.2020.102945

Abstract

Compared with deadzone hard-decision quantization (HDQ), rate-distortion optimized quantization (RDOQ) in HEVC brings a non-negligible coding gain, but it consumes considerable computation because of the exhaustive search over multiple candidates to determine the optimal output level. Benefiting from efficient prediction in HEVC, transform blocks are frequently quantized to all zero, especially small blocks. It is therefore worthwhile to detect all-zero blocks (AZBs) among transform blocks so that the subsequent computation-intensive RDOQ can be bypassed. Traditional thresholding-based AZB detection algorithms are well suited to deadzone-quantized blocks, but they miss part of the optimal results of RDOQ and suffer from accuracy degradation under RDOQ. This paper proposes a novel multi-stage AZB detection algorithm for RDOQ blocks with a good trade-off between complexity and accuracy. At the first stage, genuine all-zero blocks (G_AZB), which are quantized to all zero in both HDQ and RDOQ, are prejudged by comparison with a conservative threshold determined by mathematical derivation for deadzone HDQ. At the second stage, an adaptive threshold model is built with an adaptive deadzone offset that simulates the behavior patterns of RDOQ, aiming to further detect the pseudo AZBs (P_AZB), which are quantized to all zero in RDOQ but not in HDQ. At the final stage, machine-learning-based detection is proposed to classify the remaining “cunning” all-zero blocks using eight distinguishing RDO-related features, through which the subtle working mechanism of RDOQ is leveraged. The experimental results demonstrate that the proposed algorithm achieves up to 7.471% total coding computation saving with a 0.064% BD-rate increment on average compared with RDOQ. Moreover, the average false negative rate (FNR) and false positive rate (FPR) of the detection are 6.3% and 6.5%, respectively.

Introduction

High Efficiency Video Coding (HEVC) [1] achieves a significant improvement in compression efficiency compared with its predecessor, H.264/AVC [2]. The improvement comes from several new coding tools, including a larger Coding Tree Unit (CTU), recursive quad-tree structured Coding Unit (CU) splitting, larger Prediction Units (PU), larger Transform Units (TU), and advanced intra and inter prediction modes. Rate Distortion Optimization (RDO) is widely adopted in the HEVC reference model to optimize encoder-customizable modules such as mode decision, rate control, transform and quantization. RDO mode decision selects the optimal coding mode in the RDO sense from a large number of candidates. A simple and direct RDO can be implemented by computing the RD costs of all candidate modes and choosing the minimum. To calculate the RD costs, the encoder needs to perform transform (T), RDO quantization (RDOQ) [3], inverse quantization (IQ) and inverse transform (IT), resulting in extremely high computational complexity. As an inner component of the RDO loop, RDOQ delivers an obvious compression efficiency improvement, but it consumes considerable computation because of the exhaustive search over multiple candidates to determine the optimal quantized level in the RDO sense [4]. Benefiting from higher prediction accuracy as well as improved transform and quantization, most of the transform blocks (TU) are quantized into all-zero blocks (AZBs), especially blocks with small TU sizes such as 4 × 4 and 8 × 8 [5], [6]. If the video encoder can detect these AZBs without invoking quantization and entropy coding, considerable computation can be saved, especially for RDOQ-based HEVC coding. It is therefore meaningful to develop efficient AZB detection algorithms for HEVC encoder optimization.
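The exhaustive RDO mode decision mentioned in this paragraph can be sketched as follows. This is a minimal illustration that assumes the usual Lagrangian cost J = D + λ·R; the function names, the candidate tuple layout and all numbers are hypothetical and are not taken from the HEVC reference model.

```python
def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate


def select_mode(candidates, lam):
    """Return (name, cost) of the candidate with the smallest RD cost.

    candidates: iterable of (name, distortion, rate) tuples
    (a hypothetical layout chosen for this sketch).
    """
    best_name, best_cost = None, float("inf")
    for name, distortion, rate in candidates:
        cost = rd_cost(distortion, rate, lam)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Illustrative numbers only, not measured encoder data.
modes = [("intra_dc", 120.0, 10), ("intra_planar", 100.0, 18), ("merge", 140.0, 2)]
best, cost = select_mode(modes, lam=4.0)
```

Every candidate must be fully evaluated before the minimum is known, which is why each evaluation that can be skipped (for example, by detecting an AZB early) reduces the cost of the whole loop.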

Several fast AZB detection algorithms have been reported for H.264 encoding [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20] and HEVC encoding [21], [22], [23], [24], [25]. The algorithms in [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20] were designed for encoders that employ deadzone HDQ. These methods work well for HDQ-based encoders, but they are ill-suited to encoders based on soft-decision quantization (SDQ, for example RDOQ). In [21], [22], [23], [24], [25], SDQ was taken into consideration for AZB detection. These methods adaptively detect AZBs according to either the residual-domain SAD (sum of absolute differences) or the Hadamard-domain SATD (sum of absolute transformed differences). However, SAD and SATD only reflect the residual in the spatial or Hadamard domain, and the rate term is not fully considered. In SDQ and RDOQ, the optimal level is decided by dynamic programming, such as a Viterbi trellis search over full or partial paths, and after the competition only the path with the minimal rate-distortion cost is kept. Rate and distortion are considered simultaneously in RDOQ to achieve RD optimization. Consequently, there is still room to develop more efficient AZB detection algorithms for RDOQ-based encoders.
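To make the two residual measures concrete, the sketch below computes the SAD and an unnormalized 4 × 4 Hadamard SATD of a residual block. It is an illustration only: the helper names are hypothetical, and the normalization factor that encoder implementations typically apply to SATD is omitted here.

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def sad(residual):
    """Sum of absolute differences: sum of |r| over the residual block."""
    return sum(abs(v) for row in residual for v in row)


def satd4x4(residual):
    """Unnormalized SATD: sum of |H4 * R * H4| over a 4x4 residual block."""
    transformed = matmul(matmul(H4, residual), H4)
    return sum(abs(v) for row in transformed for v in row)


flat = [[1] * 4 for _ in range(4)]                 # constant residual
impulse = [[5, 0, 0, 0]] + [[0] * 4 for _ in range(3)]  # single nonzero sample
```

Both measures depend only on the residual itself. Neither includes the bit cost of coding the resulting levels, which is the limitation noted in the paragraph above.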

In terms of AZB detection, some AZBs are easily detected by simple thresholding, whereas others are hard to detect because of the complicated decision mechanism in RDOQ [26]. This paper aims to develop a multi-stage AZB detection algorithm for RDOQ-based HEVC coding with a good trade-off between complexity saving and detection accuracy.

At the first stage, genuine all-zero blocks (G_AZB), which are quantized to all zero in both HDQ and RDOQ, are prejudged by comparison with a conservative threshold determined by mathematical derivation for deadzone HDQ. At the second stage, an adaptive threshold model is built with an adaptive deadzone offset that simulates the behavior patterns of RDOQ, aiming to detect the pseudo AZBs (P_AZB), which are quantized to all zero in RDOQ but not in HDQ. Because of the complicated working mechanism of RDOQ, an explicit mathematical solution for an accurate AZB threshold may not be easy to derive. At the final stage, machine-learning-based detection is therefore proposed to classify the remaining “cunning” all-zero blocks using eight distinguishing features at different levels, including the TU level, the coefficient level and the context syntax element level, through which the subtle working mechanism of RDOQ is fully leveraged. This binary classification model is trained and fine-tuned offline.
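The control flow of such a three-stage cascade can be sketched as follows. This is a hedged illustration, not the paper's implementation: the scalar `score`, both thresholds, the feature dictionary, the classifier callable and the returned labels are all hypothetical placeholders, and the ordering `t_conservative <= t_adaptive` is an assumption of this sketch.

```python
def detect_azb(score, t_conservative, t_adaptive, features, classifier):
    """Three-stage cascade: cheap threshold tests first, classifier last.

    score          -- hypothetical scalar block measure (smaller = closer to zero)
    t_conservative -- stage-1 threshold (assumed derived for deadzone HDQ)
    t_adaptive     -- stage-2 threshold (assumed to mimic RDOQ behavior)
    features       -- hypothetical feature container passed to the classifier
    classifier     -- callable returning True when the block is predicted all zero
    """
    if t_conservative > t_adaptive:
        raise ValueError("this sketch assumes t_conservative <= t_adaptive")
    if score < t_conservative:
        return "G_AZB"            # stage 1: all zero in both HDQ and RDOQ
    if score < t_adaptive:
        return "P_AZB"            # stage 2: all zero in RDOQ only
    if classifier(features):
        return "cunning_AZB"      # stage 3: caught by the learned model
    return "non_AZB"              # falls through to regular RDOQ


# Stand-in classifier for demonstration; a trained model would go here.
def toy_classifier(feats):
    return feats["zero_probability"] > 0.5
```

Ordering the stages from cheapest to most expensive means the classifier only runs on blocks that both threshold tests leave undecided.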

The rest of this paper is organized as follows. Problem formulation and background analysis are given in Section 2. The proposed multi-stage AZB detection algorithm is given in Section 3. Section 4 gives the experimental results, and Section 5 concludes the whole paper.

HDQ and RDOQ

A video coding standard only stipulates the inverse quantization; forward quantization is a module that the encoder is free to customize. The HEVC reference software offers two quantization options, deadzone HDQ and RDOQ, which allow users to strike different trade-offs between complexity and performance.

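In HEVC, deadzone HDQ can be described as follows [27].

$$|l(i,j)| = \left\lfloor \frac{|c(i,j)|}{q} + f \right\rfloor = \left\lfloor \frac{|c(i,j)| \cdot M_{QP/6}}{2^{Qbits}} + f \right\rfloor = \left( |c(i,j)| \cdot M_{QP/6} + f \right) \gg Qbits, \qquad f = 171 \ll (Qbits - 9),$$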
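As a small worked illustration, the integer form of the formula above, read as (|c|·M + f) >> Qbits with f = 171 << (Qbits − 9), can be written as below. The function names are hypothetical, and the scale `scale` (playing the role of M) and `qbits` are passed in directly; how they are derived from QP and the transform size is outside this sketch.

```python
def hdq_offset(qbits):
    """Deadzone rounding offset f = 171 << (Qbits - 9)."""
    if qbits < 9:
        raise ValueError("this sketch assumes qbits >= 9")
    return 171 << (qbits - 9)


def hdq_level(coeff, scale, qbits):
    """Quantized level of one coefficient: sign(c) * ((|c| * M + f) >> Qbits)."""
    magnitude = (abs(coeff) * scale + hdq_offset(qbits)) >> qbits
    return -magnitude if coeff < 0 else magnitude


def is_all_zero_hdq(block, scale, qbits):
    """True when every coefficient of the block quantizes to level 0."""
    return all(hdq_level(c, scale, qbits) == 0 for row in block for c in row)


# Illustrative values only: scale / 2**qbits = 0.5, i.e. a step size of 2.
SCALE, QBITS = 8192, 14
levels = [hdq_level(c, SCALE, QBITS) for c in (0, 1, 2, -3)]
```

Relative to 2^Qbits the offset equals 171/512 ≈ 0.334, so a coefficient maps to zero whenever |c|·M / 2^Qbits stays below about 0.666. With the illustrative values above, that means |c| ≤ 1.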

The proposed AZB detection algorithm

In this section, we propose a multi-stage AZB detection scheme for TU blocks of different sizes, in which thresholding based on mathematical derivation and learning based on empirical analysis are employed together. In the HEVC reference model, RDOQ is an optional tool that is enabled by default. In contrast with HDQ, RDOQ aims to determine the optimal quantized coefficient levels in the RDO sense, so blocks that are non-zero after HDQ may become all-zero blocks after RDOQ. From this perspective, the blocks are categorized

Machine learning based P_AZB detection

Even with the G_AZB and P_AZB detection schemes described above, some cunning zero-quantized TUs are still missed because of the complicated rounding in RDOQ. Thus, we need to develop a more intelligent detection algorithm to handle these cunning blocks, one that fully takes the complicated context coding in CABAC and the dynamic programming into consideration.

Simulation results and analysis

This paper proposes a multi-stage AZB detection algorithm using machine learning for RDOQ-based HEVC coding. We analyzed the histograms of three types of samples, namely the AZBs classified by scheme 1 (G_AZB detection), scheme 2 (P_AZB detection) and scheme 3 (learning-based detection). The results are shown in Fig. 14. Here, the average occurrences of the three detection algorithms over six test sequences are shown with three horizontal red lines, in addition QP = 22

Conclusion

In this paper, a multi-stage AZB detection algorithm with adaptive thresholds and machine learning classification is proposed. At the first stage, G_AZB is decided according to the coefficient level using an HDQ-based adaptive classification threshold. At the second stage, an adaptive threshold model is built based on QP, aiming to further detect the P_AZB. At the final stage, machine learning is applied to develop the detection classification model using hierarchical-level features, ranging from

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported in part by NSFC 61972123, 61931008 and 61901150, ZJNSF LY19F020043, as well as Key RD project 2018YFC0830106.

References (32)

  • H. Wang et al., Early detection of all-zero 4×4 blocks in high efficiency video coding, J. Vis. Commun. Image Represent. (2014)
  • G.J. Sullivan et al., Overview of the high efficiency video coding (HEVC) standard, IEEE Trans. Circ. Syst. Video Technol. (2012)
  • T. Wiegand et al., Overview of the H.264/AVC video coding standard, IEEE Trans. Circ. Syst. Video Technol. (2003)
  • K.H. Yang, Methods and systems for rate-distortion optimized quantization of transform blocks in block transform video...
  • E.H. Yang et al., Rate distortion optimization for H.264 interframe coding: a general framework and algorithms, IEEE Trans. Image Process. Publ. IEEE Signal Process. Soc. (2007)
  • T.-Y. Huang et al., Acceleration of rate-distortion optimized quantization for H.264/AVC
  • H.B. Yin et al., Fast soft decision quantization with adaptive preselection and dynamic trellis graph, IEEE Trans. Circ. Syst. Video Technol. (2015)
  • Z. Xuan et al., Method for detecting all-zero DCT coefficients ahead of discrete cosine transformation and quantisation, Electron. Lett. (2002)
  • L.A. Sousa, General method for eliminating redundant computations in video coding, Electron. Lett. (2000)
  • H.M. Yong et al., An improved early detection algorithm for all-zero blocks in H.264 video encoding, IEEE Trans. Circ. Syst. Video Technol. (2005)
  • I.M. Pao et al., Modeling DCT coefficients for fast video encoding, IEEE Trans. Circ. Syst. Video Technol. (1999)
  • H. Wang et al., Efficient prediction algorithm of integer DCT coefficients for H.264/AVC optimization, IEEE Trans. Circ. Syst. Video Technol. (2006)
  • M. Zhang et al., Adaptive method for early detecting zero quantized DCT coefficients in H.264/AVC video encoding, IEEE Trans. Circ. Syst. Video Technol. (2009)
  • X. Ji et al., Early determination of zero-quantized 8 × 8 DCT coefficients, IEEE Trans. Circ. Syst. Video Technol. (2009)
  • Z. Xie et al., A general method for detecting all-zero blocks prior to DCT and quantization, IEEE Trans. Circ. Syst. Video Technol. (2007)
  • Y.W. Huang et al., Analysis and complexity reduction of multiple reference frames motion estimation in H.264/AVC, IEEE Trans. Circ. Syst. Video Technol. (2006)
This paper has been recommended for acceptance by Zicheng Liu.