Unsupervised disparity estimation from light field using plug-and-play weighted warping loss
Signal Processing: Image Communication (IF 3.5), Pub Date: 2022-06-08, DOI: 10.1016/j.image.2022.116764
Taisei Iwatsuki, Keita Takahashi, Toshiaki Fujii

We investigated disparity estimation from a light field using a convolutional neural network (CNN). Most existing methods implement a supervised learning framework, where the predicted disparity map is compared directly to the corresponding ground-truth disparity map in the training stage. However, light field data accompanied by ground-truth disparity maps are scarce and rarely available for real-world scenes. This lack of training data limits the generality of methods trained on them. To tackle this problem, we took a simple plug-and-play approach to remake a supervised method into an unsupervised (self-supervised) one. We replaced the loss function of the original method with one that does not depend on ground-truth disparity maps. More specifically, our loss function is designed to indirectly evaluate the accuracy of the disparity map by using warping errors among the input light field views. We designed pixel-wise weights to properly evaluate the warping errors in the presence of occlusions, and an edge loss to encourage edge alignment between the image and the disparity map. As a result of this unsupervised learning framework, our method can use more abundant training datasets (even those without ground-truth disparity maps) than the original supervised method. Our method was evaluated on computer-generated scenes (4D Light Field Benchmark) and real-world scenes captured with Lytro Illum cameras. It achieved state-of-the-art performance among unsupervised methods on the benchmark. We also demonstrated that our method estimates disparity maps more accurately than the original supervised method on various real-world scenes.
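To make the loss design concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: a warping loss with per-pixel occlusion-aware weights, and an edge term that ties disparity edges to image edges. This is an illustration under our own assumptions, not the authors' implementation: the function names (warp_view, weighted_warping_loss, edge_loss), the inverse-error weighting scheme, and the edge-aware gradient penalty are plausible stand-ins for the formulations actually used in the paper.

```python
import torch
import torch.nn.functional as F

def warp_view(view, disparity, du, dv):
    """Warp a side view toward the center view using the predicted
    disparity map; (du, dv) is the view's angular offset from the
    center, so the pixel shift scales linearly with disparity."""
    b, _, h, w = view.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=view.device),
        torch.linspace(-1.0, 1.0, w, device=view.device),
        indexing="ij",
    )
    # Convert the per-pixel shift (in pixels) to normalized [-1, 1] units.
    shift_x = disparity.squeeze(1) * du * 2.0 / (w - 1)
    shift_y = disparity.squeeze(1) * dv * 2.0 / (h - 1)
    grid = torch.stack((xs + shift_x, ys + shift_y), dim=-1)
    return F.grid_sample(view, grid, padding_mode="border", align_corners=True)

def weighted_warping_loss(center, views, offsets, disparity, eps=1e-3):
    """Photometric error between the center view and each warped side
    view, weighted per pixel so that views in which the pixel is likely
    occluded (large error relative to the other views) count less.
    The inverse-error weighting here is an assumed scheme."""
    errors = []
    for view, (du, dv) in zip(views, offsets):
        warped = warp_view(view, disparity, du, dv)
        errors.append((warped - center).abs().mean(dim=1, keepdim=True))
    errors = torch.stack(errors)                   # (n_views, B, 1, H, W)
    weights = 1.0 / (errors + eps)                 # smaller error -> larger weight
    weights = weights / weights.sum(dim=0, keepdim=True)
    # detach(): gradients flow through the errors, not the weights.
    return (weights.detach() * errors).sum(dim=0).mean()

def edge_loss(disparity, image):
    """Penalize disparity gradients where the image is smooth, pushing
    disparity edges to coincide with image edges (a common proxy for
    the edge-alignment term described in the abstract)."""
    d_dx = (disparity[..., :, 1:] - disparity[..., :, :-1]).abs()
    d_dy = (disparity[..., 1:, :] - disparity[..., :-1, :]).abs()
    i_dx = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    i_dy = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    return (d_dx * torch.exp(-i_dx)).mean() + (d_dy * torch.exp(-i_dy)).mean()
```

In training, the two terms would be combined as something like weighted_warping_loss(center, views, offsets, disp) + lam * edge_loss(disp, center), with the balance lam a hyperparameter; because neither term consults a ground-truth disparity map, the loss can plug into the original supervised network unchanged.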

Updated: 2022-06-08