GMMSP on GPU, Journal of Real-Time Image Processing

Journal of Real-Time Image Processing (IF 3), Pub Date: 2018-03-17, DOI: 10.1007/s11554-018-0762-3
Zhihua Ban, Jianguo Liu, Jeremy Fouriaux

Superpixel segmentation is a fundamental task in computer vision. Existing work advances superpixel segmentation either by improving segmentation accuracy or by reducing execution time. The former modifies existing models or develops new ones to improve accuracy. The latter accelerates existing implementations or reduces algorithmic complexity to improve execution speed. This work falls into the second category. Recently, a superpixel algorithm based on a Gaussian mixture model (GMMSP) achieved state-of-the-art accuracy. After studying this algorithm, we reached new conclusions about GMMSP that expose its potential for a fine-grained parallel implementation. We implement GMMSP with CUDA so that it runs on GPUs. Experiments are conducted to validate the consistency between the CPU and GPU implementations and to evaluate the performance of our implementation against a serial and an OpenMP implementation. For the full implementation, which includes a postprocessing step executed on the CPU to guarantee the connectivity constraint, the proposed implementation achieves a speedup of 21× over the OpenMP implementation for images of size 240 × 320 on an NVIDIA GTX 1080. Notably, if the connectivity constraint is not enforced, we achieve over 1000 FPS on the GTX 1080 (a speedup of 77× over the OpenMP implementation).
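
The abstract does not describe the kernel design, so the following is only an illustrative sketch of why a Gaussian-mixture labeling step lends itself to fine-grained parallelism; it is not the authors' code. It assumes, as a simplification, isotropic Gaussians and a hard argmax assignment, and every name in it (`log_gaussian`, `assign_pixel`, `label_image`, the toy data) is hypothetical. The point is that each pixel's label depends only on that pixel's feature vector and on shared, read-only component parameters, so each call could in principle be mapped to one GPU thread.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of an isotropic Gaussian (single variance `var`) at point x."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2.0 * math.pi * var) + sq_dist / var)

def assign_pixel(feature, components):
    """Return the index of the component with the highest log density.

    Only `feature` and the read-only `components` are touched, so calls
    for different pixels are independent of one another.
    """
    best_k, best_lp = -1, -math.inf
    for k, (mean, var) in enumerate(components):
        lp = log_gaussian(feature, mean, var)
        if lp > best_lp:
            best_k, best_lp = k, lp
    return best_k

def label_image(features, components):
    # Serial loop here; in a CUDA port each iteration could be one thread
    # (an assumption for illustration, not the paper's stated design).
    return [assign_pixel(f, components) for f in features]

# Toy data (hypothetical): features are (x, y, intensity),
# components are (mean, variance) pairs.
components = [((0.0, 0.0, 10.0), 4.0), ((5.0, 5.0, 200.0), 4.0)]
features = [(0.0, 1.0, 12.0), (5.0, 4.0, 198.0), (1.0, 0.0, 9.0)]
labels = label_image(features, components)
print(labels)
```

Because no pixel reads another pixel's result, the order in which pixels are processed does not affect the labels, which is the property a one-thread-per-pixel mapping relies on.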

Updated: 2018-03-17