Y-Autoencoders: Disentangling latent representations via sequential encoding
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-09-26, DOI: 10.1016/j.patrec.2020.09.025
Massimiliano Patacchiola, Patrick Fox-Roberts, Edward Rosten

In the last few years there have been important advances in disentangling latent representations with generative models, the two dominant approaches being Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, standard Autoencoders (AEs) and closely related architectures remain popular because they are easy to train and to adapt to different tasks. An interesting question is whether we can achieve state-of-the-art latent disentanglement with AEs while retaining their desirable properties. We propose an answer by introducing a new model called the Y-Autoencoder (Y-AE). The structure and training procedure of a Y-AE divide a representation into an implicit and an explicit part: the implicit part is similar to the output of an AE, while the explicit part is strongly correlated with the labels in the training set. The two parts are separated in the latent space by splitting the output of the encoder into two paths (forming a Y shape) before decoding and re-encoding. We then impose a number of losses, such as a reconstruction loss and a loss on the dependence between the implicit and explicit parts. Additionally, the projection onto the explicit manifold is monitored by a predictor that is embedded in the encoder and trained end-to-end, with no adversarial losses. We provide significant experimental results on various domains, such as separation of style and content, image-to-image translation, and inverse graphics.
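The split-decode-re-encode structure described above can be illustrated with a minimal numerical sketch. This is not the authors' implementation: the paper uses deep networks, an embedded predictor head, and specific loss weights, whereas the toy below uses untrained linear maps and hypothetical dimensions purely to show the "Y" split of the latent code and the cycle-style consistency terms on the implicit and explicit parts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 16-dim input, 4-dim implicit code, 3-dim explicit code.
# A real Y-AE would use deep (convolutional) encoder/decoder networks.
D_IN, D_IMP, D_EXP = 16, 4, 3
W_enc = rng.normal(scale=0.1, size=(D_IN, D_IMP + D_EXP))
W_dec = rng.normal(scale=0.1, size=(D_IMP + D_EXP, D_IN))

def encode(x):
    """Encode, then split the latent code into (implicit, explicit) paths."""
    z = x @ W_enc
    return z[:, :D_IMP], z[:, D_IMP:]  # the split forms the "Y" shape

def decode(z_imp, z_exp):
    """Decode from the concatenated implicit and explicit codes."""
    return np.concatenate([z_imp, z_exp], axis=1) @ W_dec

x = rng.normal(size=(2, D_IN))          # a toy mini-batch of 2 samples
z_imp, z_exp = encode(x)

# Branch 1: reconstruct with the sample's own explicit code.
x_rec = decode(z_imp, z_exp)
recon_loss = np.mean((x - x_rec) ** 2)

# Branch 2: decode with a *swapped* explicit code, then re-encode.
# The implicit part should survive the swap, while the re-encoded
# explicit part should match the code that was swapped in.
z_exp_swap = z_exp[::-1]                # swap explicit codes within the batch
x_swap = decode(z_imp, z_exp_swap)
z_imp2, z_exp2 = encode(x_swap)
implicit_consistency_loss = np.mean((z_imp - z_imp2) ** 2)
explicit_consistency_loss = np.mean((z_exp_swap - z_exp2) ** 2)

# In training, a weighted sum of such terms would be minimized end-to-end.
total_loss = recon_loss + implicit_consistency_loss + explicit_consistency_loss
```

In a trained model, decoding `z_imp` with a different sample's explicit code is what enables, e.g., keeping the content of an image while changing its style.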




Updated: 2020-10-02