Ultralow-Latency VLSI Architecture Based on a Linear Approximation Method for Computing Nth Roots of Floating-Point Numbers
IEEE Transactions on Circuits and Systems I: Regular Papers (IF 5.2), Pub Date: 2020-12-01, DOI: 10.1109/tcsi.2020.3038417
Fei Lyu, Xiaoqi Xu, Yu Wang, Yuanyong Luo, Yuxuan Wang, Hongbing Pan

State-of-the-art approaches that perform root computations based on the COordinate Rotation DIgital Computer (CORDIC) algorithm suffer from high latency because they require multiple iterations. Root computations based on the CORDIC algorithm therefore cannot meet the strict latency requirements of some applications. In this paper, we propose a methodology for performing Nth root computations on floating-point numbers based on the piecewise linear (PWL) approximation method. The proposed method divides an Nth root computation into several subtasks, each approximated by the PWL algorithm. It determines the widest segments of the subtasks and the smallest fractional width needed to satisfy the predefined maximum relative error Max_Errr. Our design is coded in Verilog HDL and synthesized in TSMC 40 nm CMOS technology. The synthesis results show that our design reaches a maximum frequency of 2.703 GHz with an area of 2608.84 μm² and a power consumption of 2.4476 mW. Compared with one state-of-the-art architecture, our design saves 91.60%, 89.84%, and 63.33% of the area, power, and latency, respectively, at a frequency of 1.89 GHz, while reducing Max_Errr by 57.30%. Compared with the other state-of-the-art design, it saves 94.52%, 92.68%, and 73.17% of the area, power, and delay, respectively, at 1.89 GHz, and reduces Max_Errr by 1.65%.
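
To make the idea of a PWL-based Nth root concrete, the following is a minimal software sketch, not the authors' architecture. It assumes uniform segments over the mantissa range [1, 2) (the paper instead searches for the widest segments), uses double-precision floats rather than a fixed fractional width, and all names (`build_pwl_table`, `pwl_nth_root`) are hypothetical. It splits x = m · 2^e with e = q·N + r, so that the root is m^(1/N) · 2^(r/N) · 2^q, and approximates only m^(1/N) with straight-line segments.

```python
import math


def build_pwl_table(n, segments):
    """Build (slope, intercept) pairs approximating m**(1/n) on [1, 2).

    Assumption: uniform segment widths and chord (endpoint) interpolation.
    """
    table = []
    for i in range(segments):
        x0 = 1.0 + i / segments
        x1 = 1.0 + (i + 1) / segments
        y0 = x0 ** (1.0 / n)
        y1 = x1 ** (1.0 / n)
        slope = (y1 - y0) / (x1 - x0)
        table.append((slope, y0 - slope * x0))
    return table


def pwl_nth_root(x, n, table):
    """Approximate the n-th root of a positive float x using the PWL table."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # frexp gives x = m * 2**e with 0.5 <= m < 1; rescale so 1 <= m < 2.
    m, e = math.frexp(x)
    m *= 2.0
    e -= 1
    # Split the exponent: e = q*n + r with 0 <= r < n.
    q, r = divmod(e, n)
    # Select the segment from the leading bits of the mantissa fraction.
    segments = len(table)
    idx = min(int((m - 1.0) * segments), segments - 1)
    slope, intercept = table[idx]
    root_m = slope * m + intercept
    # 2**(r/n) takes only n distinct values; in hardware it could be a small
    # lookup table (an assumption of this sketch, computed directly here).
    return math.ldexp(root_m * 2.0 ** (r / n), q)


if __name__ == "__main__":
    cube_table = build_pwl_table(3, 64)
    approx = pwl_nth_root(10.0, 3, cube_table)
    exact = 10.0 ** (1.0 / 3.0)
    print(approx, exact, abs(approx / exact - 1.0))
```

In this sketch, increasing `segments` lowers the maximum relative error at the cost of a larger table, which mirrors the trade-off between segment width and Max_Errr described in the abstract.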

Updated: 2020-12-01