An FPGA comparative study of high‐level and low‐level combined designs for HEVC intra, inverse quantization, and IDCT/IDST 2D modules
International Journal of Circuit Theory and Applications ( IF 2.3 ) Pub Date : 2020-04-14 , DOI: 10.1002/cta.2790
Ahmed Ben Atitallah 1, 2 , Manel Kammoun 2, 3 , Karim M.A. Ali 4 , Rabie Ben Atitallah 5

Two design methods are widely used to implement complex signal processing algorithms. The first, low-level synthesis (LLS), consists of writing hardware description language (HDL) code by hand. The second, high-level synthesis (HLS), generates the register transfer level (RTL) description automatically from a high-level language description. This paper studies the impact of both methods on a complex application, the High Efficiency Video Coding (HEVC) decoder. To that end, we analyzed the complexity of the HEVC decoder in software, using version 10 of the HEVC test model (HM) reference software, to determine which portions most needed optimization. A combined architecture for intra prediction (IP) and inverse quantization and transform (IQ/IT) was then implemented in hardware with both HLS and LLS. Results on a Xilinx Zynq 7045-based field-programmable gate array (FPGA) show that the HLS implementation yields a gain of about 80% in look-up tables (LUTs), with a 93% increase in DSP blocks, compared with the LLS implementation. However, only the LLS design achieves real-time decoding at 4K@26fps; the HLS design reaches 1080p@24fps.

Updated: 2020-04-14