Journal of Signal Processing Systems (IF 1.8), Pub Date: 2019-09-04, DOI: 10.1007/s11265-019-01466-5. Authors: Pavel Arnaudov, Tokunbo Ogunfunmi
This paper presents a new Machine Learning based approach to video Fast Motion Estimation. The approach improves quality, minimizes power consumption, and provides control over the balance between performance and power, which makes it well suited to hardware implementation in a motion co-processor. Many mobile and hand-held devices today deploy such hardware accelerators. The main goal of the presented algorithm is to achieve the highest Motion Estimation quality per unit of consumed power by minimizing the number of search points, with an optional mechanism for finding an optimal early termination point. The paper describes the creation of a dictionary of adaptively pre-learned fixed search patterns, together with a pre-trained neural network that helps select the most suitable search pattern from the dictionary according to the dynamics of the motion within a specific region of the video frame, rather than within the frame or scene as a whole. A single scene or frame often contains motion in several directions, and the ability to focus on a local region within the frame improves quality significantly. Full Search represents the quality goal and upper bound for any integer Fast Motion Estimation. The presented algorithm adds about 1 dB of PSNR over state-of-the-art fixed search patterns, leaving a gap of only about 0.5 dB of PSNR between our algorithm and Full Search.
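The idea of searching only the points of a pattern chosen per region, with optional early termination, can be illustrated with a minimal sketch. This is not the authors' implementation: the pattern names and offsets in `PATTERNS`, the rule in `select_pattern` (a simple heuristic standing in for the pre-trained neural network), the SAD cost, and the toy frames are all assumptions made for illustration.

```python
# Illustrative sketch only. The patterns, the selector, and the test frames
# are hypothetical; the paper's learned dictionary and network are not shown.

def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy).
    Returns None when the displaced block falls outside `ref`."""
    h, w = len(ref), len(ref[0])
    if not (0 <= bx + dx and bx + dx + n <= w and
            0 <= by + dy and by + dy + n <= h):
        return None
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

# Hypothetical dictionary of fixed search patterns: lists of (dx, dy) offsets.
PATTERNS = {
    "still": [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)],
    "horizontal": [(dx, 0) for dx in range(-4, 5)],
    "vertical": [(0, dy) for dy in range(-4, 5)],
}

def select_pattern(neighbor_mv):
    """Stand-in for the pre-trained network: choose a pattern from the motion
    vector of a neighboring block, so the choice is local to a region."""
    dx, dy = neighbor_mv
    if dx == 0 and dy == 0:
        return "still"
    return "horizontal" if abs(dx) >= abs(dy) else "vertical"

def pattern_search(cur, ref, bx, by, n, pattern, stop_at=None):
    """Evaluate only the offsets in `pattern`. If `stop_at` is given, stop as
    soon as the best cost is at or below it (early termination).
    Returns (motion_vector, cost, points_checked)."""
    best_mv, best_cost, checked = (0, 0), None, 0
    for dx, dy in pattern:
        cost = sad(cur, ref, bx, by, dx, dy, n)
        if cost is None:
            continue
        checked += 1
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
        if stop_at is not None and best_cost <= stop_at:
            break
    return best_mv, best_cost, checked

def full_search(cur, ref, bx, by, n, r):
    """Exhaustive search over every integer offset within +/- r pixels."""
    window = [(dx, dy) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    return pattern_search(cur, ref, bx, by, n, window)

# Toy frames: a textured reference, and a current frame whose content has
# moved 2 pixels to the right, so the true motion vector is (-2, 0).
W = H = 16
ref = [[x * x + 3 * y * y for x in range(W)] for y in range(H)]
cur = [[ref[y][max(x - 2, 0)] for x in range(W)] for y in range(H)]

name = select_pattern((-2, 0))  # assumed neighbor motion vector
full = full_search(cur, ref, 6, 6, 4, 4)
fast = pattern_search(cur, ref, 6, 6, 4, PATTERNS[name])
early = pattern_search(cur, ref, 6, 6, 4, PATTERNS[name], stop_at=0)
print(name, full, fast, early)
```

On these toy frames the exhaustive search evaluates 81 points, the selected pattern reaches the same motion vector with 9 points, and early termination reaches it with 3. The numbers only illustrate the search-point trade-off and say nothing about the PSNR figures reported in the abstract.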
Title (translated from the Chinese listing): Artificial Intelligence Adaptive Search Fast Motion Estimation Algorithm for HD Video