Indoor 3D Semantic Robot VSLAM Based on Mask Regional Convolutional Neural Network
IEEE Access ( IF 3.9 ) Pub Date : 2020-01-01 , DOI: 10.1109/access.2020.2981648
Chongben Tao , Zhen Gao , Jinli Yan , Chunguang Li , Guozeng Cui

When a robot builds indoor semantic maps with visual SLAM (VSLAM), it suffers from low label classification accuracy and low precision when feature points are sparse. To address this, this paper proposes an indoor three-dimensional semantic VSLAM algorithm based on the Mask Regional Convolutional Neural Network (Mask RCNN). First, the Oriented FAST and Rotated BRIEF (ORB) algorithm is used to extract image feature points. Second, the Random Sample Consensus (RANSAC) algorithm is employed to eliminate mismatched points and estimate changes in camera pose. Then, Mask RCNN is applied with some of its hyperparameters adjusted, and transfer learning on a self-made data set enables real-time object detection and instance segmentation of the scene. A three-dimensional semantic map is constructed by combining these results with the VSLAM algorithm. The semantic information in the environment not only improves the accuracy of VSLAM mapping and localization, but also reduces the impact of object movement on mapping by marking movable objects. Meanwhile, the VSLAM algorithm is used to calculate positional constraints between objects, which improves the accuracy of semantic understanding. Finally, a comparison with other methods shows that the proposed method is more accurate and effective. It is also verified that the proposed method can accurately interpret semantic information in the environment for the construction of three-dimensional semantic maps.
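
The mismatch-elimination step named in the abstract can be illustrated with a minimal RANSAC sketch. This is not the authors' implementation: it assumes a simplified 2D translation-only motion model between two frames instead of a full camera pose, and the function name `ransac_translation`, its parameters, and the sample matches are all hypothetical.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from point matches and flag the inliers.

    matches: list of ((x1, y1), (x2, y2)) pairs between two frames.
    Returns (translation, inlier_flags). Simplified stand-in for pose
    estimation: a real VSLAM system would fit a rigid or projective model.
    """
    rng = random.Random(seed)
    best_t = (0.0, 0.0)
    best_inliers = [False] * len(matches)
    for _ in range(iterations):
        # A translation has two degrees of freedom, so a single match
        # is a minimal sample for one hypothesis.
        (x1, y1), (x2, y2) = rng.choice(matches)
        t = (x2 - x1, y2 - y1)
        # A match is an inlier if its displacement agrees with the hypothesis.
        inliers = [
            ((bx - ax - t[0]) ** 2 + (by - ay - t[1]) ** 2) ** 0.5 <= threshold
            for (ax, ay), (bx, by) in matches
        ]
        if sum(inliers) > sum(best_inliers):
            best_t, best_inliers = t, inliers
    # Refit on all inliers of the best hypothesis (mean displacement).
    kept = [m for m, ok in zip(matches, best_inliers) if ok]
    if kept:
        best_t = (
            sum(b[0] - a[0] for a, b in kept) / len(kept),
            sum(b[1] - a[1] for a, b in kept) / len(kept),
        )
    return best_t, best_inliers


# Made-up data: six matches consistent with a shift of (+5, -3),
# plus two mismatches that RANSAC should reject.
good = [((x, y), (x + 5.0, y - 3.0))
        for x, y in [(10, 10), (20, 15), (30, 40), (45, 5), (50, 50), (60, 22)]]
bad = [((12, 12), (80.0, 90.0)), ((33, 8), (1.0, 2.0))]
matches = good + bad

translation, flags = ransac_translation(matches)
print("estimated translation:", translation)
print("kept matches:", sum(flags), "of", len(matches))
```

In a full pipeline the matches would come from ORB descriptors, and the model fitted inside the loop would be a camera pose instead of a translation. The hypothesize, score, and refit structure stays the same.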

Updated: 2020-01-01