Empty Cities: A Dynamic-Object-Invariant Space for Visual SLAM
IEEE Transactions on Robotics (IF 7.8), Pub Date: 2020-01-01, DOI: 10.1109/tro.2020.3031267
Berta Bescos, Cesar Cadena, Jose Neira
In this paper, we present a data-driven approach to obtain the static image of a scene, eliminating dynamic objects that might have been present at the time of traversing the scene with a camera. The general objective is to improve vision-based localization and mapping tasks in dynamic environments, where the presence (or absence) of different dynamic objects at different moments makes these tasks less robust. We introduce an end-to-end deep learning framework to turn images of an urban environment that include dynamic content, such as vehicles or pedestrians, into realistic static frames suitable for localization and mapping. This objective faces two main challenges: detecting the dynamic objects, and inpainting the occluded static background. The first challenge is addressed by the use of a convolutional network that learns a multi-class semantic segmentation of the image. The second challenge is approached with a generative adversarial model that, taking as input the original dynamic image and the computed dynamic/static binary mask, is capable of generating the final static image. This framework makes use of two new losses: one based on image steganalysis techniques, useful to improve the inpainting quality, and another based on ORB features, designed to enhance feature matching between real and hallucinated image regions. To validate our approach, we perform an extensive evaluation with the hallucinated images on different tasks that are affected by dynamic entities, i.e., visual odometry, place recognition and multi-view stereo. Code has been made available on this https URL.
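The hand-off between the two stages described in the abstract (a multi-class segmentation collapsed into a dynamic/static binary mask, which is then fed to the generator together with the original image) can be illustrated with a minimal sketch. This is not the authors' code: the class IDs, the function names `dynamic_mask` and `generator_input`, and the choice to stack the mask as an extra channel are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical label IDs for dynamic classes (e.g. person, rider, car).
# These values are an assumption and depend on the label map actually used.
DYNAMIC_CLASS_IDS = (11, 12, 13)


def dynamic_mask(seg_labels, dynamic_ids=DYNAMIC_CLASS_IDS):
    """Collapse a multi-class label map of shape (H, W) into a binary mask.

    Returns a float32 array of shape (H, W): 1.0 = dynamic, 0.0 = static.
    """
    return np.isin(seg_labels, dynamic_ids).astype(np.float32)


def generator_input(image, mask):
    """Stack an image (H, W, C) and a mask (H, W) into one (H, W, C + 1) array.

    Appending the mask as an extra channel is one common way to condition a
    generator on it; the paper may combine the two inputs differently.
    """
    return np.concatenate([image.astype(np.float32), mask[..., None]], axis=-1)


if __name__ == "__main__":
    # Toy 2x2 label map: two pixels belong to hypothetical dynamic classes.
    seg = np.array([[0, 11], [13, 5]])
    img = np.zeros((2, 2, 3), dtype=np.float32)
    mask = dynamic_mask(seg)
    x = generator_input(img, mask)
    print(mask)
    print(x.shape)
```

In this sketch the mask is kept as floating point so that it can be concatenated with a normalized image without a further cast.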
Updated: 2020-01-01