Generative Multi-Symbol Architecture of the Binary Arithmetic Coder for UHDTV Video Encoders
IEEE Transactions on Circuits and Systems I: Regular Papers (IF 5.1), Pub Date: 2020-03-01, DOI: 10.1109/tcsi.2019.2949882
Grzegorz Pastuszak

Binary arithmetic coding is a key part of recent video compression standards. Its throughput is limited by data dependencies inherent in the algorithm, so higher bin parallelism leads to lower clock frequencies. This paper presents an architecture that exceeds the limits of previous hardware implementations. The architecture exploits less probable symbols as starting points for long series of bins coded in one clock cycle. Evaluating four possible cases of the range value allows the range to be updated in the pipeline before a delayed selection based on the actual value. An adaptive division into series is proposed to make long series more frequent. To shorten critical paths, the rMPS variables computed for symbols coded in the same clock cycle are first summed and then added to the low register. Up to 16 bypass-mode symbols can be processed in parallel with context-coded symbols in one clock cycle. The architecture is generative, i.e., its throughput can be scaled with resources without strict limits. For example, a binary arithmetic coder synthesized in 90 nm TSMC technology consumes 101.4k gates, operates at 570 MHz, and achieves an average throughput of 13.4 bins per clock cycle for high-quality H.265/HEVC compression.
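
The abstract's remark that rMPS values are summed before being added to the low register can be illustrated with a small Python sketch. It is not the paper's hardware design; it only shows, under simplifying assumptions, why the reordering leaves the result unchanged. The assumptions are: each coded bin contributes an offset to low (the rMPS value for a less probable symbol, 0 for a more probable symbol) followed by a renormalization left shift; integers are unbounded, so carry propagation and bit output are ignored; and the function names and sample values are hypothetical.

```python
def sequential_low(low, steps):
    """Reference order: for each bin, add its offset to low, then shift left.

    `steps` is a list of (offset, shift) pairs, one per bin (hypothetical model).
    """
    for offset, shift in steps:
        low = (low + offset) << shift
    return low


def summed_low(low, steps):
    """Reordered form: align and sum all offsets first, then add to low once."""
    total_shift = sum(shift for _, shift in steps)
    remaining = total_shift  # shifts still to be applied from this bin onward
    offset_sum = 0
    for offset, shift in steps:
        offset_sum += offset << remaining
        remaining -= shift
    return (low << total_shift) + offset_sum


# Illustrative values only: an LPS, an MPS (offset 0), and another LPS.
example_steps = [(300, 1), (0, 0), (200, 2)]
assert sequential_low(100, example_steps) == summed_low(100, example_steps)
```

In the reordered form the offsets can be combined independently of the current low value, so only a single addition involves the low register. That is one plausible reading of how summing first shortens the critical path.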

Updated: 2020-03-01