Text-to-image via mask anchor points
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-02-13, DOI: 10.1016/j.patrec.2020.02.013
Samah S. Baraheem, Tam V. Nguyen

Text-to-image synthesis is the process of generating an image from input text. It has a variety of applications in art generation, computer-aided design, and data synthesis. In this paper, we propose a new framework which leverages mask anchor points and carries out image synthesis in two major steps. In the first step, a mask image is generated from the input text and the mask dataset. In the second step, the mask image is fed into a state-of-the-art mask-to-image generator. Note that the mask image captures the semantic information and the spatial relationships between objects via the anchor points. We also developed a user-friendly interface which helps parse the input text into meaningful semantic objects. As a result, our framework is able to produce clear, reasonable, and more realistic images. Experiments on the challenging COCO-Stuff dataset illustrate the superiority of our proposed approach over previous state-of-the-art methods.
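
The two-step flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how masks are selected or positioned, so everything here is an assumption made for illustration, including the names `MaskTemplate`, `paste_mask`, `text_to_mask`, and `text_to_image`, the convention that an anchor point marks the top-left corner of a mask, and the stand-in `generator` callable that takes the place of a real mask-to-image model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

LabelMap = List[List[int]]  # 2D grid of class ids; 0 means background


@dataclass
class MaskTemplate:
    """Hypothetical entry of a mask dataset: a class id plus a binary shape."""
    label_id: int
    mask: List[List[int]]  # 2D grid of 0/1 values


def paste_mask(canvas: LabelMap, template: MaskTemplate,
               anchor: Tuple[int, int]) -> LabelMap:
    """Write the template's class id into the canvas.

    Assumption: the anchor is the (row, col) of the template's top-left
    corner. Cells that fall outside the canvas are skipped.
    """
    top, left = anchor
    for r, row in enumerate(template.mask):
        for c, value in enumerate(row):
            rr, cc = top + r, left + c
            if value and 0 <= rr < len(canvas) and 0 <= cc < len(canvas[0]):
                canvas[rr][cc] = template.label_id
    return canvas


def text_to_mask(objects: Sequence[str],
                 mask_dataset: Dict[str, MaskTemplate],
                 anchors: Dict[str, Tuple[int, int]],
                 height: int, width: int) -> LabelMap:
    """Step 1 (sketch): compose a semantic label map from parsed object names.

    Objects are pasted in order, so later objects overwrite earlier ones.
    """
    canvas = [[0] * width for _ in range(height)]
    for name in objects:
        paste_mask(canvas, mask_dataset[name], anchors[name])
    return canvas


def text_to_image(objects: Sequence[str],
                  mask_dataset: Dict[str, MaskTemplate],
                  anchors: Dict[str, Tuple[int, int]],
                  height: int, width: int,
                  generator: Callable[[LabelMap], object]) -> object:
    """Step 2 (sketch): hand the label map to any mask-to-image callable."""
    return generator(text_to_mask(objects, mask_dataset, anchors, height, width))


# Toy usage with made-up data: a "sky" band on top and a "dog" block below it.
mask_dataset = {
    "sky": MaskTemplate(label_id=1, mask=[[1, 1, 1, 1], [1, 1, 1, 1]]),
    "dog": MaskTemplate(label_id=2, mask=[[1, 1], [1, 1]]),
}
anchors = {"sky": (0, 0), "dog": (2, 1)}
label_map = text_to_mask(["sky", "dog"], mask_dataset, anchors, height=4, width=4)
# The identity function stands in for a real mask-to-image generator.
image = text_to_image(["sky", "dog"], mask_dataset, anchors, 4, 4,
                      generator=lambda m: m)
```

Keeping the generator as a plain callable mirrors the separation the abstract describes: the label map carries the semantic and positional information, and any mask-to-image model could be substituted in the second step.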




Updated: 2020-03-07