Forward-Projection Architecture for Fast Iterative Image Reconstruction in X-Ray CT
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2012-10-01, DOI: 10.1109/tsp.2012.2208636
Jung Kuk Kim, Jeffrey A. Fessler, Zhengya Zhang

Iterative image reconstruction can dramatically improve image quality in X-ray computed tomography (CT), but the computation involves repeated 3D forward- and back-projection steps, which impedes routine clinical use. To accelerate forward-projection, we analyze the CT geometry to identify the intrinsic parallelism and data access sequence for a highly parallel hardware architecture. To improve the efficiency of this architecture, we propose a water-filling buffer to remove pipeline stalls, and out-of-order sectored processing to reduce off-chip memory access by up to three orders of magnitude. We convert from floating point to fixed point based on numerical simulations and demonstrate comparable image quality at a much lower implementation cost. As a proof of concept, a 5-stage fully pipelined, 55-way parallel separable-footprint forward-projector is prototyped on a Xilinx Virtex-5 FPGA for a throughput of 925.8 million voxel projections/s at a 200 MHz clock frequency, 4.6 times that of an optimized 16-thread program running on an 8-core 2.8-GHz CPU. A similar architecture can be applied to back-projection for a complete iterative image reconstruction system. The proposed algorithm and architecture can also be applied to hardware platforms such as graphics processing units and digital signal processors to achieve significant accelerations.
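
The abstract mentions a floating-point to fixed-point conversion chosen through numerical simulation but does not give word lengths or the selection procedure. The following is only a minimal sketch of how such a sweep might look, assuming signed fixed point with saturation and a maximum-absolute-error criterion. The function names (`to_fixed`, `from_fixed`, `max_abs_error`, `min_frac_bits`), the 32-bit word, and the tolerance are all hypothetical and are not taken from the paper.

```python
def to_fixed(x, frac_bits, total_bits=32):
    """Quantize x to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * scale))          # round to the nearest representable step
    return max(lo, min(hi, q))         # clamp to the signed word range


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def max_abs_error(values, frac_bits, total_bits=32):
    """Largest round-trip quantization error over a set of test values."""
    return max(
        abs(v - from_fixed(to_fixed(v, frac_bits, total_bits), frac_bits))
        for v in values
    )


def min_frac_bits(values, tol, total_bits=32):
    """Smallest number of fractional bits whose round-trip error is within tol.

    Returns None if no setting below total_bits - 1 meets the tolerance.
    """
    for frac_bits in range(total_bits - 1):
        if max_abs_error(values, frac_bits, total_bits) <= tol:
            return frac_bits
    return None


if __name__ == "__main__":
    # Hypothetical test data standing in for simulated projection weights.
    samples = [i / 1000 for i in range(1000)]
    print(min_frac_bits(samples, tol=1e-3))
```

In a real design the error would more likely be measured on reconstructed image quality than on individual values, so the criterion above should be read as a placeholder.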

Updated: 2012-10-01