3D Tensor Auto-encoder with Application to Video Compression
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.2), Pub Date: 2021-05-12, DOI: 10.1145/3431768
Yang Li 1, Guangcan Liu 2, Yubao Sun 2, Qingshan Liu 2, Shengyong Chen 3

Auto-encoders have been widely used to compress high-dimensional data such as images and videos. However, a traditional auto-encoder network needs to store a large number of parameters: when the input data is of dimension n, the number of parameters in an auto-encoder is in general O(n). In this article, we introduce a network structure called the 3D Tensor Auto-Encoder (3DTAE). Unlike the traditional auto-encoder, in which a video is represented as a vector, 3DTAE treats videos as 3D tensors and passes tensor objects directly through the network. The weights of each layer are represented by three small matrices, and thus the number of parameters in 3DTAE is just O(n^{1/3}). This compactness fits the needs of video compression well. Given an ensemble of high-dimensional videos, we represent them as 3DTAE networks plus some small core tensors, and we further quantize the network parameters and the core tensors to obtain the final compressed data. Experimental results verify the efficiency of 3DTAE.
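
The abstract does not give the exact layer formula, so the following is only a minimal NumPy sketch of the idea that a layer's weights can be three small matrices applied along the three axes of a video tensor. The mode-product form, the tanh activation, the tensor shapes, and the names `mode_product` and `tensor_layer` are all assumptions made for illustration, not the authors' implementation. The sketch also compares the parameter count with a single dense layer acting on the flattened video.

```python
import numpy as np


def mode_product(x, u, mode):
    """Multiply tensor x by matrix u along one axis (a mode-n product)."""
    # Bring the target axis to the front, contract it with u's columns,
    # then move the resulting axis back to its original position.
    moved = np.moveaxis(x, mode, 0)
    out = np.tensordot(u, moved, axes=([1], [0]))
    return np.moveaxis(out, 0, mode)


def tensor_layer(x, u1, u2, u3):
    """Hypothetical tensor layer: three mode products followed by tanh."""
    y = mode_product(x, u1, 0)
    y = mode_product(y, u2, 1)
    y = mode_product(y, u3, 2)
    return np.tanh(y)


rng = np.random.default_rng(0)
in_shape = (32, 32, 16)  # height, width, frames (illustrative sizes)
core_shape = (8, 8, 4)   # size of the small core tensor (illustrative)

video = rng.standard_normal(in_shape)
# One small weight matrix per axis, each of shape (core size, input size).
factors = [
    0.1 * rng.standard_normal((m, n)) for m, n in zip(core_shape, in_shape)
]
core = tensor_layer(video, *factors)

n_in = int(np.prod(in_shape))
n_core = int(np.prod(core_shape))
tensor_params = sum(f.size for f in factors)
# Baseline for comparison: one dense layer from flattened video to flattened core.
dense_params = n_in * n_core

print("core tensor shape:", core.shape)
print("tensor-layer parameters:", tensor_params)
print("dense-layer parameters:", dense_params)
```

In this toy setting the three factor matrices hold a few hundred weights, while the dense baseline holds several million. This is consistent in spirit with the gap the abstract describes, although the shapes here are arbitrary.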

Updated: 2021-05-12