Second-order Spectral Transform Block for 3D Shape Classification and Retrieval.
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2020-01-23, DOI: 10.1109/tip.2020.2967579
Ruixuan Yu, Jian Sun, Huibin Li

In this paper, we propose a novel network block, dubbed the second-order spectral transform block, for 3D shape retrieval and classification. This block generalizes second-order pooling to 3D surfaces by designing a learnable non-linear transform on the spectrum of the pooled descriptor. The proposed block consists of the following two components. First, second-order average (SO-Avr) and max-pooling (SOMax) operations are designed on 3D surfaces to aggregate local descriptors; these are shown to be more discriminative than the popular average pooling or max pooling. Second, a learnable spectral transform parameterized by a mixture of power functions is proposed to perform non-linear feature mapping in the space of pooled descriptors, i.e., the manifold of symmetric positive definite matrices for SO-Avr and the space of symmetric matrices for SOMax. The proposed block can be plugged into existing network architectures to aggregate local shape descriptors and boost their performance. We apply it to a shallow network for nonrigid 3D shape analysis and to existing networks for rigid shape analysis, where it improves the first-tier retrieval accuracy by 7.2% on the SHREC'14 Real dataset and achieves state-of-the-art classification accuracy on ModelNet40. As an extension, we apply our block to 2D image classification, where it outperforms traditional second-order pooling methods. We also provide theoretical and experimental analysis of the stability of the proposed second-order spectral transform block.
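The two components described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the uniform (unweighted) pooling over points, the `eps` regularizer, the sign-preserving power map for negative eigenvalues, and the function names `so_avg_pool`, `so_max_pool`, and `spectral_transform` are all assumptions made for illustration. In the paper the mixture parameters are learned; here they are fixed inputs.

```python
import numpy as np


def so_avg_pool(feats, eps=1e-6):
    """Second-order average pooling (assumed form): mean of outer products.

    feats: (N, d) array of local descriptors.
    Returns a (d, d) symmetric positive definite matrix; eps * I is an
    assumed regularizer that keeps the result strictly positive definite.
    """
    n, d = feats.shape
    return feats.T @ feats / n + eps * np.eye(d)


def so_max_pool(feats):
    """Second-order max pooling (assumed form): element-wise max of outer products.

    The result is symmetric but not necessarily positive definite.
    """
    outer = feats[:, :, None] * feats[:, None, :]  # shape (N, d, d)
    return outer.max(axis=0)


def spectral_transform(mat, weights, powers):
    """Apply a mixture of power functions to the eigenvalues of a symmetric matrix.

    Each eigenvalue lam is mapped to sum_k w_k * sign(lam) * |lam| ** p_k.
    The sign-preserving form is an assumption so that negative eigenvalues
    (possible after max pooling) are handled without complex numbers.
    """
    lam, vecs = np.linalg.eigh(mat)
    mapped = sum(w * np.sign(lam) * np.abs(lam) ** p
                 for w, p in zip(weights, powers))
    # Reconstruct U diag(mapped) U^T by scaling the eigenvector columns.
    return (vecs * mapped) @ vecs.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((50, 4))   # 50 local descriptors, 4 channels
    pooled = so_avg_pool(feats)
    out = spectral_transform(pooled, weights=[0.5, 0.5], powers=[0.5, 1.0])
    print(out.shape)
```

With a single power of 0.5 and weight 1, the transform reduces to the matrix square root of the pooled descriptor, which is the fixed normalization used in many second-order pooling methods; a mixture of several powers makes that choice adjustable.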

Updated: 2020-04-22