Sampling Based Scene-Space Video Processing
arXiv - CS - Graphics Pub Date: 2021-02-05, DOI: arxiv-2102.03011
Felix Klose, Oliver Wang, Jean-Charles Bazin, Marcus Magnor, Alexander Sorkine-Hornung

Many compelling video processing effects can be achieved if per-pixel depth information and 3D camera calibrations are known. However, the success of such methods is highly dependent on the accuracy of this "scene-space" information. We present a novel sampling-based framework for processing video that enables high-quality scene-space video effects in the presence of inevitable errors in depth and camera pose estimation. Instead of trying to improve the explicit 3D scene representation, the key idea of our method is to exploit the high redundancy of approximate scene information that arises because most scene points are visible multiple times across many frames of video. Based on this observation, we propose a novel pixel gathering and filtering approach. The gathering step is general and collects pixel samples in scene-space, while the filtering step is application-specific and computes a desired output video from the gathered sample sets. Our approach is easily parallelizable and has been implemented on the GPU, allowing us to take full advantage of large volumes of video data and facilitating practical runtimes on HD video using a standard desktop computer. Our generic scene-space formulation is able to comprehensively describe a multitude of video processing applications such as denoising, deblurring, super-resolution, object removal, computational shutter functions, and other scene-space camera effects. We present results for various casually captured, hand-held, moving, compressed, monocular videos depicting challenging scenes recorded in uncontrolled environments.
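The gather/filter split described above lends itself to a compact illustration. The following Python sketch (not the authors' GPU implementation; all names such as Camera, Frame, gather_samples, denoise_filter, and the depth_tol tolerance are hypothetical, and a simple pinhole model stands in for the paper's calibration handling) unprojects a reference pixel to a 3D scene point, reprojects it into other frames to gather redundant color samples under a deliberately tolerant depth-consistency test, and then applies a robust per-channel median as the denoising variant of the application-specific filtering step:

```python
# Minimal sketch of scene-space gathering and filtering, assuming a pinhole
# camera model; an illustration of the idea, not the paper's implementation.
from dataclasses import dataclass
import numpy as np

class Camera:
    """Pinhole camera: K is the 3x3 intrinsic matrix, R/t map world -> camera."""
    def __init__(self, K, R, t):
        self.K, self.R, self.t = K, R, t

    def unproject(self, u, v, depth):
        # Lift pixel (u, v) at the given depth to a 3D point in world space.
        ray = np.linalg.inv(self.K) @ np.array([u, v, 1.0])
        return self.R.T @ (ray * depth - self.t)

    def project(self, X):
        # Project world point X; returns (u, v, depth in this camera).
        x = self.K @ (self.R @ X + self.t)
        return x[0] / x[2], x[1] / x[2], x[2]

@dataclass
class Frame:
    image: np.ndarray  # H x W x 3 color image
    depth: np.ndarray  # H x W estimated per-pixel depth
    cam: Camera        # estimated calibration for this frame

def gather_samples(u, v, ref, frames, depth_tol=0.05):
    """Gather step (application-agnostic): collect the colors of the scene
    point seen at (u, v) in the reference frame from every frame where it
    is visible, using a tolerant depth test to absorb estimation errors."""
    X = ref.cam.unproject(u, v, ref.depth[v, u])
    samples = []
    for f in frames:
        uj, vj, zj = f.cam.project(X)
        if zj <= 0:  # point lies behind this camera
            continue
        ui, vi = int(round(uj)), int(round(vj))
        if not (0 <= ui < f.image.shape[1] and 0 <= vi < f.image.shape[0]):
            continue
        # Relative depth-consistency check: rejects occluded samples while
        # tolerating the inevitable noise in the depth estimates.
        if abs(f.depth[vi, ui] - zj) < depth_tol * zj:
            samples.append(f.image[vi, ui])
    return np.array(samples)

def denoise_filter(samples):
    """Filter step (application-specific): a robust per-channel median of
    the redundant samples yields a denoised color for the output pixel."""
    return np.median(samples, axis=0)
```

Swapping denoise_filter for a different reduction over the same gathered samples, for instance a weighted blend over a virtual exposure interval to emulate a computational shutter, is what makes the gather step reusable across the applications listed in the abstract.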

Updated: 2021-02-08