UHD 8K energy-quality scalable HEVC intra-prediction SAD unit hardware using optimized and configurable imprecise adders

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

Real-time digital video coding has become a mandatory feature of current consumer electronic devices due to the popularization of video applications. However, efficiently encoding video is an extremely processing- and energy-demanding task, especially at high resolutions and frame rates. The limited energy resources and the dynamically varying system status (workload, battery level, user settings, etc.) therefore require energy-efficient solutions capable of supporting run-time energy-quality scalability. In this work, we present an energy-quality scalable sum of absolute differences (SAD) unit hardware architecture for HEVC intra-frame prediction, targeting real-time processing of UHD 8K (7680 × 4320) videos at 60 frames per second. Approximate computing provides the energy-quality scalability through configurable imprecise operators. The proposed architecture supports four operation points: precise computing, and 3-bit, 5-bit, or 7-bit imprecision. When implemented in a 45-nm technology using the Nangate standard-cell library and running at 269 MHz, the proposed architecture consumes from 8.42 to 7.38 mJ to process each UHD 8K frame, depending on the selected imprecision level. As a drawback, the coding efficiency loss (measured in BD-rate) ranges from 0.28 to 1.72%. Compared with related works, this is the only intra-frame prediction SAD unit able to provide energy-quality scalability.
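To make the idea of a configurable imprecise SAD concrete, the following is a minimal software sketch and not the architecture proposed in the article. It assumes a lower-part-OR style approximate adder, in which the k least-significant bits are combined with a bitwise OR and only the upper bits are added exactly; the abstract does not state which imprecise adder is used, so this choice is an illustrative stand-in. The names `loa_add` and `sad`, the sequential accumulation (hardware would more likely use an adder tree), and the exact computation of the absolute differences are all assumptions.

```python
def loa_add(a: int, b: int, k: int) -> int:
    """Approximately add two non-negative integers.

    Assumed lower-part-OR scheme: the k least-significant bits are
    approximated by a bitwise OR, the remaining bits are added exactly.
    k = 0 behaves as a precise adder.
    """
    if k == 0:
        return a + b
    low_mask = (1 << k) - 1
    low = (a | b) & low_mask
    # Carry into the exact upper part: AND of the top bits of both lower parts.
    carry = (a >> (k - 1)) & (b >> (k - 1)) & 1
    high = (a >> k) + (b >> k) + carry
    return (high << k) | low


def sad(original, predicted, k: int = 0) -> int:
    """Sum of absolute differences, accumulated with the approximate adder.

    The absolute differences themselves are computed exactly here
    (an assumption); only the accumulation is imprecise.
    """
    total = 0
    for o, p in zip(original, predicted):
        total = loa_add(total, abs(o - p), k)
    return total


if __name__ == "__main__":
    # Hypothetical 4x4 block of samples and its prediction, flattened.
    block = [52, 55, 61, 66, 70, 61, 64, 73, 63, 59, 55, 90, 67, 61, 68, 104]
    pred = [50, 50, 60, 60, 70, 70, 60, 70, 60, 60, 60, 80, 70, 60, 70, 100]
    # The four operation points named in the abstract.
    for k in (0, 3, 5, 7):
        print(f"imprecision bits = {k}: SAD = {sad(block, pred, k)}")
```

Running the script prints the SAD of the same block at the four operation points, which shows how the result drifts from the precise value as more low-order bits are approximated. In an encoder, that drift can change which intra-prediction mode is selected, which is the mechanism behind the BD-rate loss reported above.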


Acknowledgements

This work is partly financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES) Finance Code 001, by FCT projects PTDC/EEI-HAC/30485/2017 and UID/CEC/50021/2019, and also by CNPq and FAPERGS Brazilian research support agencies.

Author information

Corresponding author

Correspondence to Roger Porto.

About this article

Cite this article

Porto, R., Correa, M., Goebel, J. et al. UHD 8K energy-quality scalable HEVC intra-prediction SAD unit hardware using optimized and configurable imprecise adders. J Real-Time Image Proc 17, 1685–1701 (2020). https://doi.org/10.1007/s11554-019-00934-2

  • DOI: https://doi.org/10.1007/s11554-019-00934-2
