PVNAS: 3D Neural Architecture Search With Point-Voxel Convolution.
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2022-10-04, DOI: 10.1109/tpami.2021.3109025
Zhijian Liu, Haotian Tang, Shengyu Zhao, Kevin Shao, Song Han

3D neural networks are widely used in real-world applications (e.g., AR/VR headsets, self-driving cars). They are required to be fast and accurate; however, limited hardware resources on edge devices make these requirements rather challenging. Previous work processes 3D data using either voxel-based or point-based neural networks, but both types of 3D models are not hardware-efficient due to the large memory footprint and random memory access. In this paper, we study 3D deep learning from the efficiency perspective. We first systematically analyze the bottlenecks of previous 3D methods. We then combine the best from point-based and voxel-based models together and propose a novel hardware-efficient 3D primitive, Point-Voxel Convolution (PVConv). We further enhance this primitive with the sparse convolution to make it more effective in processing large (outdoor) scenes. Based on our designed 3D primitive, we introduce 3D Neural Architecture Search (3D-NAS) to explore the best 3D network architecture given a resource constraint. We evaluate our proposed method on six representative benchmark datasets, achieving state-of-the-art performance with 1.8-23.7× measured speedup. Furthermore, our method has been deployed to the autonomous racing vehicle of MIT Driverless, achieving larger detection range, higher accuracy and lower latency.

Updated: 2021-09-01