TreeGAN: Incorporating Class Hierarchy into Image Generation
arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-16, DOI: arxiv-2009.07734
Ruisi Zhang, Luntian Mou, Pengtao Xie

Conditional image generation (CIG) is a widely studied problem in computer vision and machine learning. Given a class, CIG takes the class name as input and generates a set of images belonging to that class. Existing CIG methods generate images for different classes independently, without considering the relationships among classes. In real-world applications, classes are organized into a hierarchy, and their hierarchical relationships are informative for generating high-fidelity images. In this paper, we aim to leverage the class hierarchy for conditional image generation. We propose two ways of incorporating the class hierarchy: prior control and post constraint. In prior control, we first encode the class hierarchy, then feed the encoding as a prior into the conditional generator to generate images. In post constraint, after the images are generated, we measure their consistency with the class hierarchy and use the consistency score to guide the training of the generator. Based on these two ideas, we propose TreeGAN, a model consisting of three modules: (1) a class hierarchy encoder (CHE), which takes the hierarchical structure of the classes and their textual names as inputs and learns an embedding for each class that captures the hierarchical relationships among classes; (2) a conditional image generator (CIG), which takes the CHE-generated embedding of a class as input and generates a set of images belonging to this class; (3) a consistency checker, which performs hierarchical classification on the generated images and checks whether they are compatible with the class hierarchy; the resulting consistency score is used to guide the CIG to generate hierarchy-compatible images. Experiments on various datasets demonstrate the effectiveness of our method.
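
To make the post-constraint idea more concrete, the following is a minimal sketch of how a hierarchy consistency score could be computed for one generated image. It is an illustration only, not the authors' implementation: the abstract does not define the score, so the toy hierarchy `PARENT`, the helper `path_to_root`, and the overlap-based metric in `consistency_score` are all assumptions introduced here. The score is taken to be the fraction of the target class's root path (the class and its ancestors) that also appears on the root path of the class predicted by a hierarchical classifier.

```python
# Hypothetical toy hierarchy (not from the paper): child -> parent, None marks the root.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "husky": "dog",
    "poodle": "dog",
    "siamese": "cat",
}


def path_to_root(cls, parent=PARENT):
    """Return the list [cls, its parent, ..., root]."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = parent[cls]
    return path


def consistency_score(target, predicted, parent=PARENT):
    """Assumed metric: share of the target's root path that the predicted
    class's root path also contains. 1.0 means the prediction is the target."""
    target_path = set(path_to_root(target, parent))
    predicted_path = set(path_to_root(predicted, parent))
    return len(target_path & predicted_path) / len(target_path)


if __name__ == "__main__":
    # A sibling class shares more of the hierarchy than a class from another branch.
    for guess in ("husky", "poodle", "siamese"):
        print(guess, consistency_score("husky", guess))
```

Under this assumed metric, an image intended to be a husky but classified as a poodle scores higher than one classified as a siamese, because the first error stays within the "dog" subtree. In a training loop, a differentiable analogue of such a score would be needed to guide the generator; this sketch only shows the hierarchy bookkeeping.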


