Adaptive Deep Learning-Based Point Cloud Geometry Coding
IEEE Journal of Selected Topics in Signal Processing (IF 7.5), Pub Date: 2020-12-25, DOI: 10.1109/jstsp.2020.3047520
Andre F. R. Guarda, Nuno M. M. Rodrigues, Fernando Pereira

Point clouds are a very rich 3D visual representation model, which has become increasingly appealing for multimedia applications with immersion, interaction and realism requirements. Due to different acquisition and creation conditions as well as target applications, the characteristics of point clouds may be very diverse, notably in their density. While geographical information systems or autonomous driving applications may use rather sparse point clouds, cultural heritage or virtual reality applications typically use denser point clouds to represent objects and people more accurately. Naturally, to offer immersion and realism, point clouds need a rather large number of points, which calls for the development of efficient coding solutions. The use of deep learning models for coding purposes has recently gained relevance, with the latest developments in image coding achieving state-of-the-art performance, which makes it natural to adopt this technology for point cloud coding as well. This paper presents a novel deep learning-based solution for point cloud geometry coding which is able to efficiently adapt to the content's characteristics. The proposed coding solution divides the point cloud into 3D blocks and selects the most suitable available deep learning coding model to code each block, thus maximizing the compression performance. In comparison to the state-of-the-art MPEG G-PCC Trisoup standard, the proposed coding solution offers average quality gains of up to 4.9 dB and 5.7 dB for PSNR D1 and PSNR D2, respectively.
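
The per-block model selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the selection criterion, so the Lagrangian cost D + lam * R used here is an assumption, and the names partition_into_blocks, select_model_per_block, toy_sparse_model and toy_dense_model are hypothetical. The two toy "models" are plain functions that return made-up (distortion, rate) pairs in place of trained deep learning codecs.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int, int]
BlockKey = Tuple[int, int, int]
# A "model" here is any callable that codes a block and reports
# (distortion, rate). Real codecs would be trained networks.
Model = Callable[[List[Point]], Tuple[float, float]]


def partition_into_blocks(points: List[Point], block_size: int) -> Dict[BlockKey, List[Point]]:
    """Group integer voxel coordinates by the cubic block that contains them."""
    blocks: Dict[BlockKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        key = (x // block_size, y // block_size, z // block_size)
        blocks[key].append((x, y, z))
    return dict(blocks)


def select_model_per_block(
    blocks: Dict[BlockKey, List[Point]],
    models: Dict[str, Model],
    lam: float,
) -> Dict[BlockKey, str]:
    """Pick, for each block, the model name with the lowest cost D + lam * R.

    The cost function is an assumption made for this sketch.
    """
    choice: Dict[BlockKey, str] = {}
    for key, pts in blocks.items():
        best_name, best_cost = "", float("inf")
        for name, codec in models.items():
            distortion, rate = codec(pts)
            cost = distortion + lam * rate
            if cost < best_cost:
                best_name, best_cost = name, cost
        choice[key] = best_name
    return choice


# Hypothetical stand-ins with invented numbers: one is cheaper for blocks
# with few points, the other for blocks with many points.
def toy_sparse_model(pts: List[Point]) -> Tuple[float, float]:
    return 1.0, 2.0 * len(pts)


def toy_dense_model(pts: List[Point]) -> Tuple[float, float]:
    return 4.0, 0.5 * len(pts)


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 1, 1), (2, 3, 1), (9, 9, 9)]
    blocks = partition_into_blocks(cloud, block_size=8)
    models = {"sparse": toy_sparse_model, "dense": toy_dense_model}
    for key, name in select_model_per_block(blocks, models, lam=1.0).items():
        print(key, "->", name)
```

In a real codec the chosen model index would also have to be signalled to the decoder for each block, which adds a small rate overhead that this sketch ignores.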

Updated: 2021-02-23