Cognitive Template-Clustering Improved LineMod for Efficient Multi-object Pose Estimation
Cognitive Computation (IF 5.4) Pub Date: 2020-03-17, DOI: 10.1007/s12559-020-09717-5
Tielin Zhang, Yang Yang, Yi Zeng, Yuxuan Zhao

Various types of theoretical algorithms have been proposed for 6D pose estimation, e.g., the point-pair method, template matching, Hough forests, and deep learning. However, they still fall far short of the performance of natural biological systems, which can estimate the 6D poses of multiple objects efficiently, even under severe occlusion. Inspired by the Müller-Lyer illusion in the biological visual system, this paper proposes a cognitive template-clustering improved LineMod (CT-LineMod) model. The model replaces the standard 3D spatial points in the clustering procedure of Patch-LineMod with 7D cognitive feature vectors, in which the cognitive distance between 3D spatial points is further influenced by an additional 4D component encoding the direction and magnitude of features in the Müller-Lyer illusion. The 7D vectors are reduced to 3D by gradient descent and then clustered by K-means to match templates in aggregate and automatically eliminate superfluous clusters, which makes template matching possible at both holistic and part-based scales. The model has been verified on the standard Doumanoglou dataset and demonstrates state-of-the-art performance, showing the accuracy and efficiency of the proposed approach in cognitive feature-distance measurement and template selection for multi-object pose estimation under severe occlusion. The powerful feature representation in the biological visual system also exhibits characteristics of the Müller-Lyer illusion, which, to some extent, provides guidance toward a biologically plausible algorithm for efficient 6D pose estimation under severe occlusion.
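
The three-stage pipeline described in the abstract can be summarized in a short, illustrative Python sketch. This is not the authors' implementation: the helper names (build_cognitive_features, reduce_to_3d, cluster_templates), the distance-preserving gradient-descent embedding, and the cluster-elimination threshold are assumptions used only to make the stages concrete, i.e., forming 7D cognitive feature vectors, reducing them to 3D by gradient descent, and clustering with K-means while dropping superfluous clusters.

```python
# Hypothetical sketch of the CT-LineMod clustering pipeline described above.
# Helper names and details are illustrative, not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans


def build_cognitive_features(points_3d, directions, magnitudes):
    """Concatenate 3D spatial points with 4D Müller-Lyer-style cues
    (a 3D feature direction plus a scalar magnitude) into 7D vectors."""
    return np.hstack([points_3d, directions, magnitudes[:, None]])


def reduce_to_3d(features_7d, lr=0.01, steps=500):
    """Embed the 7D cognitive features in 3D with a simple gradient-descent
    scheme that tries to preserve pairwise 7D distances (an assumption; the
    paper only states that gradient descent performs the reduction)."""
    n = features_7d.shape[0]
    rng = np.random.default_rng(0)
    x = rng.normal(size=(n, 3))
    d_target = np.linalg.norm(features_7d[:, None] - features_7d[None, :], axis=-1)
    for _ in range(steps):
        diff = x[:, None] - x[None, :]                       # (n, n, 3)
        d_cur = np.linalg.norm(diff, axis=-1) + 1e-9         # avoid div by zero
        grad = ((d_cur - d_target) / d_cur)[:, :, None] * diff
        x -= lr * grad.sum(axis=1) / n
    return x


def cluster_templates(points_3d, k):
    """K-means over the reduced 3D points; clusters with too few members are
    treated as superfluous and dropped (the 5% threshold is illustrative)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points_3d)
    labels, counts = np.unique(km.labels_, return_counts=True)
    kept = labels[counts >= 0.05 * len(points_3d)]
    return km, kept


# Example usage with random stand-in data:
# p, d, m = np.random.rand(200, 3), np.random.rand(200, 3), np.random.rand(200)
# km, kept_clusters = cluster_templates(reduce_to_3d(build_cognitive_features(p, d, m)), k=8)
```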
