CPU and GPU real-time filtering methods for dense surface metrology using general matrix to matrix multiplications
Journal of Real-Time Image Processing (IF 3), Pub Date: 2022-02-18, DOI: 10.1007/s11554-022-01204-4
R. Usamentiaga

Filtering is a required task in surface metrology for the identification of the components relevant for automated quality control. The calculation of real-time features about the surface is crucial to determining the mechanical and physical properties of the inspected product. The computation efficiency of the filtering operations is a major challenge in surface metrology, as current sensors provide massive volumes of data at very high acquisition rates. To overcome the challenges, this work presents different real-time filtering solutions comparing the performance on the CPU and on the GPU, using modern hardware. The proposed framework is focused on filtering techniques that can be expressed using a finite impulse response (FIR) kernel that includes the Gaussian kernel, the most common filtering technique recommended by ISO and ASME standards. This research work proposes variations of the double FIFO and double circular filters. The filters are transformed into a series of general matrix to matrix multiplications, which can be run extremely efficiently on different architectures. The proposed filtering approach provides superior performance compared with previous works. Additionally, tests are carried out to quantify the performance of the GPU in terms of data transfer and computation capabilities in order to diminish the penalty imposed by data transfer from main memory to the GPU in real-time operations. Based on the results, an efficient batch filtering technique is proposed that can be run on the GPU faster than the CPU even for small profile and kernel sizes, offloading this task from the host CPU for optimal system and application response.
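
The abstract's central idea, expressing FIR filtering as matrix multiplication, can be illustrated with a small sketch. This is not the paper's double FIFO or double circular filter; it is a minimal example, assuming NumPy, a generic Gaussian kernel parameterized by a hypothetical `sigma` (not the ISO cutoff-wavelength form), and profiles stored as the columns of a matrix. The helper names `gaussian_kernel`, `fir_matrix`, and `filter_batch` are made up for illustration.

```python
import numpy as np


def gaussian_kernel(half_width: int, sigma: float) -> np.ndarray:
    """Normalized symmetric Gaussian FIR kernel with 2*half_width + 1 taps."""
    x = np.arange(-half_width, half_width + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def fir_matrix(kernel: np.ndarray, n: int) -> np.ndarray:
    """Banded matrix T such that T @ p equals the 'valid' convolution of a
    length-n profile p with the kernel."""
    m = kernel.size
    rows = n - m + 1
    T = np.zeros((rows, n))
    for i in range(rows):
        # Reversed taps so each row computes a true convolution sample.
        T[i, i:i + m] = kernel[::-1]
    return T


def filter_batch(profiles: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Filter a batch of profiles (one per column) with a single matrix product."""
    T = fir_matrix(kernel, profiles.shape[0])
    return T @ profiles  # one matrix-to-matrix multiplication for the whole batch


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.standard_normal((64, 5))   # 5 profiles of 64 samples each
    smoothed = filter_batch(batch, gaussian_kernel(4, 1.5))
    print(smoothed.shape)
```

Batching many profiles into one product is what lets a tuned matrix-multiplication routine do the work. The dense banded matrix used here is wasteful for long profiles and is chosen only to keep the example short.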




Updated: 2022-02-21