Autoencoder-based image processing framework for object appearance modifications
Neural Computing and Applications ( IF 6 ) Pub Date : 2020-05-28 , DOI: 10.1007/s00521-020-04976-7
Krzysztof Ślot , Paweł Kapusta , Jacek Kucharski

This paper introduces a novel method for enabling appearance modifications of complex image objects. Qualitative visual object properties, quantified using appropriately derived visual attribute descriptors, are subject to alteration. We adopt a basic convolutional autoencoder as the framework for the proposed attribute modification algorithm, which is composed of the following three steps. The algorithm begins by extracting attribute-related information from the autoencoder’s latent representation of an input image by means of supervised principal component analysis. Next, appearance alteration is performed in the derived feature space (referred to as the ‘attribute space’), based on appropriately identified mappings between quantitative descriptors of image attributes and attribute-space features. Finally, the modified attribute vectors are transformed back to the latent representation, and the output image is reconstructed by the decoding part of the autoencoder. The method has been evaluated on two datasets: images of simple objects (digits from the MNIST handwritten-digit dataset) and images of complex objects (faces from the CelebA dataset). In the former case, two qualitative visual attributes of digit images were selected for modification, slant and aspect ratio, whereas in the latter case, the aspect ratio of the face oval was altered. Evaluation results show, in both qualitative and quantitative terms, that the proposed framework offers a promising tool for visual object editing.
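The three-step pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the latent codes are synthetic stand-ins for a trained encoder's output, the decoder is omitted, and the supervised-PCA step is approximated by a least-squares regression direction of the attribute onto the latents; all names (`z`, `attr`, `delta`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: latent codes z (n x d), as a trained autoencoder's
# encoder might produce, with one scalar attribute value per image
# (e.g. digit slant). Here the attribute is correlated with a single
# hidden latent direction, plus noise.
n, d = 500, 32
z = rng.normal(size=(n, d))
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
attr = z @ true_dir + 0.05 * rng.normal(size=n)

# Step 1: extract the attribute-related direction from the latent space.
# The paper uses supervised PCA; this sketch substitutes the
# least-squares regression direction of the attribute onto the latents.
w, *_ = np.linalg.lstsq(z - z.mean(0), attr - attr.mean(), rcond=None)
w /= np.linalg.norm(w)

# Step 2: edit the attribute by shifting each latent code along the
# recovered direction in the (here 1-D) attribute space.
delta = 2.0                  # desired change in attribute value
z_edited = z + delta * w

# Step 3: in the full framework the edited codes would be passed to the
# decoder to reconstruct the modified images (decoder omitted here).
# We can still verify that the attribute value moved by roughly delta:
mean_shift = float(np.mean(z_edited @ true_dir - attr))
```

With enough samples the regression direction `w` aligns closely with the true attribute direction, so the mean attribute shift lands near `delta`; in the real pipeline the analogous check is performed on the decoded images' attribute descriptors.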




Updated: 2020-05-28