Fast intra-coding unit partition decision in H.266/FVC based on deep learning

  • Special Issue Paper
  • Journal of Real-Time Image Processing

Abstract

In the recent Future Video Coding (FVC) standard developed by the Joint Video Exploration Team (JVET), the quad-tree plus binary-tree (QTBT) block partitioning module supports rectangular blocks and additional square block sizes compared with the quad-tree (QT) block partitioning module of its predecessor, the High-Efficiency Video Coding (HEVC) standard. This flexibility significantly improves compression performance, but it dramatically increases coding complexity because of the brute-force search required by Rate Distortion Optimization (RDO). To cope with this issue, it is necessary to consider the unique characteristics of QTBT in FVC. In this paper, we propose a fast QT partitioning algorithm based on a deep convolutional neural network (CNN) model that predicts the coding unit (CU) partition instead of running RDO, which considerably accelerates QTBT for intra-mode coding. Using a suitably diversified database of CU partition patterns, a three-level CNN structure is trained to learn the split or non-split decision. Experimental results show that the proposed algorithm accelerates the QTBT block partition structure, reducing intra-mode encoding time by 35% on average with a bit rate increase of 1.7%, which allows its application in practical scenarios.
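
The idea of replacing the exhaustive RDO search with a learned split or non-split decision can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the 64/32/16 decision levels, the 8×8 minimum CU size, and above all the variance threshold, which is only a stand-in for a trained CNN and is not the predictor described in the paper.

```python
from typing import Callable, List, Tuple

Block = List[List[int]]       # 2-D luma samples, row-major
Leaf = Tuple[int, int, int]   # (x, y, size) of a CU that is not split further


def crop(frame: Block, x: int, y: int, size: int) -> Block:
    """Return the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def variance_predictor(block: Block, threshold: float = 100.0) -> bool:
    """Hypothetical stand-in for a trained CNN: predict 'split' for textured blocks."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return var > threshold


def partition(frame: Block, x: int, y: int, size: int,
              predict_split: Callable[[Block], bool],
              min_size: int = 8) -> List[Leaf]:
    """Quad-tree partition driven by a predictor instead of an RDO search.

    With size=64 and min_size=8 the predictor is queried at three
    levels (64, 32 and 16); these sizes are assumptions of this sketch.
    """
    if size <= min_size or not predict_split(crop(frame, x, y, size)):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Leaf] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += partition(frame, x + dx, y + dy, half,
                                predict_split, min_size)
    return leaves


# Usage: a 64x64 frame that is flat except for a checkerboard
# texture in its top-left 32x32 quadrant.
frame = [[255 if (x < 32 and y < 32 and (x + y) % 2) else 0
          for x in range(64)] for y in range(64)]
leaves = partition(frame, 0, 0, 64, variance_predictor)
```

In this sketch each block costs one predictor call rather than a full encode of every candidate partition, which is where a time saving of the kind reported above would come from. A wrong prediction is the source of the accompanying bit rate increase, since the chosen partition may differ from the one RDO would have selected.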

Author information

Corresponding author

Correspondence to Maraoui Amna.

About this article

Cite this article

Amna, M., Imen, W., Ezahra, S.F. et al. Fast intra-coding unit partition decision in H.266/FVC based on deep learning. J Real-Time Image Proc 17, 1971–1981 (2020). https://doi.org/10.1007/s11554-020-00998-5

  • DOI: https://doi.org/10.1007/s11554-020-00998-5
