Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering
International Journal of Computer Vision (IF 19.5) Pub Date: 2024-03-08, DOI: 10.1007/s11263-024-02027-5
Guofeng Mei, Cristiano Saltori, Elisa Ricci, Nicu Sebe, Qiang Wu, Jian Zhang, Fabio Poiesi

Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful, application-dependent selection of the types of augmentations to be performed, thus potentially biasing the information learned by the network during self-training. Moreover, several unsupervised methods focus only on uni-modal information, which can pose challenges in the case of sparse and textureless point clouds. To address these issues, we propose an augmentation-free unsupervised approach for point clouds, named CluRender, that learns transferable point-level features by leveraging uni-modal information for soft clustering and cross-modal information for neural rendering. Soft clustering enables self-training through a pseudo-label prediction task, where the affiliation of points to their clusters serves as a proxy target under the constraint that these pseudo-labels divide the point cloud into approximately equal partitions. This allows us to formulate a clustering loss that minimizes the standard cross-entropy between pseudo and predicted labels. Neural rendering generates photorealistic renderings from various viewpoints to transfer photometric cues from 2D images to the features. The consistency between rendered and real images is then measured to form a fitting loss, which is combined with the cross-entropy loss to self-train the network. Experiments on downstream applications, including 3D object detection, semantic segmentation, classification, part segmentation, and few-shot learning, demonstrate that our framework outperforms state-of-the-art techniques.
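To make the two self-training objectives concrete, below is a minimal PyTorch sketch of how such losses could be assembled. It is our own illustration, not the paper's implementation: we assume the approximately-equal-partition constraint on pseudo-labels is enforced with a few Sinkhorn-style normalization steps (a common choice for balanced clustering), and we stand in for the neural-rendering fitting loss with a simple L1 photometric term; the temperature value, tensor shapes, and function names are hypothetical.

```python
# Hedged sketch of the two self-training losses described in the abstract.
import torch
import torch.nn.functional as F


def balanced_pseudo_labels(scores: torch.Tensor, n_iters: int = 3) -> torch.Tensor:
    """Turn per-point cluster scores (N x K) into soft pseudo-labels whose
    cluster sizes are approximately equal (Sinkhorn-style normalization,
    assumed here as one way to realize the equal-partition constraint)."""
    q = torch.exp(scores / 0.05)                # temperature-sharpened affinities (assumed value)
    for _ in range(n_iters):
        q = q / q.sum(dim=0, keepdim=True)      # balance cluster sizes
        q = q / q.sum(dim=1, keepdim=True)      # each point's assignment sums to 1
    return q.detach()                           # pseudo-labels act as fixed targets


def clustering_loss(pred_logits: torch.Tensor, pseudo: torch.Tensor) -> torch.Tensor:
    """Cross-entropy between pseudo-labels and predicted cluster assignments."""
    log_p = F.log_softmax(pred_logits, dim=1)
    return -(pseudo * log_p).sum(dim=1).mean()


def rendering_loss(rendered: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    """Photometric consistency between rendered and real images
    (L1 used here as a placeholder for the paper's fitting loss)."""
    return F.l1_loss(rendered, real)


if __name__ == "__main__":
    N, K = 1024, 32                             # points per cloud, number of clusters (illustrative)
    scores = torch.randn(N, K)                  # similarities of point features to cluster centroids
    pred_logits = torch.randn(N, K, requires_grad=True)
    rendered = torch.rand(2, 3, 128, 128, requires_grad=True)
    real = torch.rand(2, 3, 128, 128)

    pseudo = balanced_pseudo_labels(scores)
    loss = clustering_loss(pred_logits, pseudo) + rendering_loss(rendered, real)
    loss.backward()
    print(f"total self-training loss: {loss.item():.4f}")
```

In this sketch, only the combined scalar loss would be backpropagated into the point-cloud encoder and the rendering branch; the pseudo-label computation is detached so it behaves as a target rather than a trainable quantity.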

Updated: 2024-03-08