Abstract
This article presents a novel approach to learning and detecting distinctive regions on 3D shapes. Unlike previous works, which require labeled data, our method is unsupervised. We conduct the analysis on point sets sampled from 3D shapes, then formulate and train a deep neural network for an unsupervised shape clustering task, so that it learns local and global features for distinguishing shapes with respect to a given shape set. To drive the network to learn in an unsupervised manner, we design a clustering-based nonparametric softmax classifier with an iterative re-clustering of shapes, and an adapted contrastive loss for enhancing the feature embedding quality and stabilizing the learning process. In this way, we encourage the network to learn the point distinctiveness on the input shapes. We extensively evaluate various aspects of our approach and present its applications to distinctiveness-guided shape retrieval, sampling, and view selection in 3D scenes.
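The training signal described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the temperature, the margin, the nearest-centroid re-clustering step, and all function names are assumptions chosen for illustration, and random vectors stand in for the features a point-set network would produce.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def cluster_softmax(features, centroids, temperature=0.1):
    # Nonparametric softmax: the class "weights" are the cluster centroids
    # themselves rather than learned parameters (temperature is assumed).
    logits = l2_normalize(features) @ l2_normalize(centroids).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def recluster(features, centroids):
    # Stand-in re-clustering step: assign each shape to its most similar
    # centroid, then recompute centroids from the new members.
    f = l2_normalize(features)
    labels = np.argmax(f @ l2_normalize(centroids).T, axis=1)
    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        members = f[labels == k]
        if len(members) > 0:
            new_centroids[k] = members.mean(axis=0)
    return labels, new_centroids

def contrastive_loss(features, labels, margin=1.0):
    # Generic pairwise contrastive loss: pull same-cluster pairs together,
    # push different-cluster pairs apart until they are `margin` away.
    f = l2_normalize(features)
    dist = np.sqrt(((f[:, None, :] - f[None, :, :]) ** 2).sum(axis=-1))
    same = labels[:, None] == labels[None, :]
    pair_loss = np.where(same, dist ** 2, np.maximum(margin - dist, 0.0) ** 2)
    off_diagonal = ~np.eye(len(labels), dtype=bool)  # ignore self-pairs
    return 0.5 * pair_loss[off_diagonal].mean()

# Hypothetical data: 8 shapes with 16-dimensional global features, 3 clusters.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
centroids = features[:3].copy()

labels, centroids = recluster(features, centroids)
probs = cluster_softmax(features, centroids)
loss = contrastive_loss(features, labels)
```

In a full training loop, the cluster probabilities would feed a classification loss on the current cluster labels, and the re-clustering step would be repeated periodically as the features improve.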
Supplemental Material
A supplemental movie, appendix, images, and software files are available for "Unsupervised Detection of Distinctive Regions on 3D Shapes."