GPU based parallel optimization for real time panoramic video stitching
Pattern Recognition Letters ( IF 5.1 ) Pub Date : 2019-06-27 , DOI: 10.1016/j.patrec.2019.06.018
Chengyao Du , Jingling Yuan , Jiansheng Dong , Lin Li , Mincheng Chen , Tao Li

Panoramic video is video recorded from a single viewpoint that captures the full surrounding scene. With the development of video surveillance and the demand for 3D converged video surveillance in smart cities, producing panoramic video requires substantial CPU and GPU processing power. Traditional panoramic products depend on post-processing, which leads to high power consumption, low stability, and unsatisfactory real-time performance. To address these problems, we propose a real-time panoramic video stitching framework. The framework consists of three main algorithms: the L-ORB image feature extraction algorithm, a feature point matching algorithm based on LSH, and a GPU parallel video stitching algorithm based on CUDA. Experimental results show that these algorithms improve performance in the feature extraction and matching stages of image stitching, running 11.3 times faster than the traditional ORB algorithm and 641 times faster than the traditional SIFT algorithm. Based on an analysis of GPU resource occupancy when stitching at each image resolution, we further propose a stream-parallel strategy to maximize the utilization of GPU resources. Compared with the L-ORB algorithm alone, this strategy improves efficiency by 1.6–2.5 times and makes full use of GPU resources. The system achieves 29.2 times the performance of the previous embedded implementation, while power dissipation is reduced to 10 W.
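The abstract does not detail the LSH matcher, but ORB-style descriptors are binary strings compared under Hamming distance, and a standard LSH scheme for that space hashes each descriptor by sampling a fixed subset of its bits. The following is a minimal stdlib sketch of that general technique, not the authors' implementation; all function names and parameters (table count, bits per key) are illustrative.

```python
import random

random.seed(0)  # deterministic tables for this illustration

DESC_BITS = 256     # ORB descriptors are 256-bit binary strings
NUM_TABLES = 4      # independent LSH hash tables (illustrative choice)
BITS_PER_KEY = 16   # bits sampled per table's hash key (illustrative choice)

def make_tables():
    """Each table hashes a descriptor by sampling a fixed random subset of its bits."""
    return [random.sample(range(DESC_BITS), BITS_PER_KEY) for _ in range(NUM_TABLES)]

def lsh_key(desc, bit_idx):
    """Project an integer-encoded descriptor onto the table's sampled bits."""
    return tuple((desc >> b) & 1 for b in bit_idx)

def build_index(descs, tables):
    """Bucket every descriptor into every table by its sampled-bit key."""
    index = [dict() for _ in tables]
    for i, d in enumerate(descs):
        for t, bits in enumerate(tables):
            index[t].setdefault(lsh_key(d, bits), []).append(i)
    return index

def hamming(a, b):
    """Hamming distance between two integer-encoded binary descriptors."""
    return bin(a ^ b).count("1")

def match(query, descs, index, tables):
    """Collect bucket candidates from all tables, return the nearest by Hamming distance."""
    cand = set()
    for t, bits in enumerate(tables):
        cand.update(index[t].get(lsh_key(query, bits), []))
    if not cand:
        return None
    return min(cand, key=lambda i: hamming(query, descs[i]))
```

Similar descriptors tend to agree on the sampled bits and so land in the same bucket of at least one table, which is why the candidate set stays small compared with brute-force all-pairs matching.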


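The stream-parallel strategy overlaps the stitching stages of successive frames so that no GPU engine sits idle. As a rough CPU-side analogy (the paper uses CUDA streams; here Python threads stand in for streams, and the stage bodies are placeholders, not the real extraction/matching/blending kernels):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative three-stage stitching pipeline. In the real system these
# stages run as CUDA kernels on separate streams, so stage k of frame i
# overlaps stage k-1 of frame i+1 on the GPU.

def extract_features(frame):   # stand-in for L-ORB feature extraction
    return [x * 2 for x in frame]

def match_features(features):  # stand-in for LSH feature matching
    return sum(features)

def blend_frame(matched):      # stand-in for warping/blending the panorama
    return matched + 1

def stitch_serial(frames):
    """Baseline: frames pass through all stages one after another."""
    return [blend_frame(match_features(extract_features(f))) for f in frames]

def stitch_pipelined(frames, workers=3):
    """Each frame flows through the stages independently; the executor lets
    different frames occupy different stages at the same time, mirroring how
    multiple CUDA streams keep the GPU's copy and compute engines busy."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        futures = [ex.submit(lambda f: blend_frame(match_features(extract_features(f))), f)
                   for f in frames]
        return [fut.result() for fut in futures]
```

Both paths produce identical panoramas; the pipelined one simply hides each stage's latency behind the other frames' work, which is where the reported 1.6–2.5x efficiency gain over running L-ORB alone comes from.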


Updated: 2020-03-07