HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding
arXiv - CS - Sound Pub Date : 2021-07-22 , DOI: arxiv-2107.10843
Darius Petermann, Seungkwon Beack, Minje Kim

An autoencoder-based codec employs quantization to turn its bottleneck layer activation into bitstrings, a process that hinders information flow between the encoder and decoder parts. To circumvent this issue, we employ additional skip connections between the corresponding pairs of encoder-decoder layers. The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer. Hence, any additional information propagated directly from the corresponding encoder layer helps the reconstruction. We implement these skip connections in the form of additional autoencoders, each of which is a small codec that compresses the massive data transfer between the paired encoder-decoder layers. We empirically verify that the proposed hyper-autoencoded architecture improves perceptual audio quality compared to an ordinary autoencoder baseline.
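The data flow described above can be sketched in a few lines. This is a minimal, untrained NumPy illustration, not the paper's implementation: the layer sizes, the single hyper-autoencoder, and the additive merge of the skip signal into the decoder are all assumptions for illustration, and the bottleneck quantization step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random (untrained) linear layer, for shape illustration only."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1

def relu(x):
    return np.maximum(x, 0.0)

# Mirrored autoencoder: 512 -> 256 -> 64 (bottleneck) -> 256 -> 512.
enc1, enc2 = linear(512, 256), linear(256, 64)
dec2, dec1 = linear(64, 256), linear(256, 512)

# A small "hyper" autoencoder on the skip connection: it compresses the
# encoder-layer activation (256 -> 16 -> 256) so the extra data transfer
# between the paired encoder-decoder layers stays small.
hyp_down, hyp_up = linear(256, 16), linear(16, 256)

def harp_net_sketch(x):
    h1 = relu(x @ enc1)                  # encoder layer 1 activation
    z = relu(h1 @ enc2)                  # bottleneck (quantized in a real codec)
    skip = relu(h1 @ hyp_down) @ hyp_up  # compressed skip-connection path
    d2 = relu(z @ dec2) + skip           # decoder layer merges the skip signal
    return d2 @ dec1                     # reconstruction

x = rng.standard_normal((1, 512))
y = harp_net_sketch(x)
print(y.shape)  # (1, 512)
```

The decoder layer thus receives both the bottleneck path and a cheaply coded copy of its mirrored encoder layer's features, which is the intuition behind the reconstruction-propagation idea.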

Updated: 2021-07-23