CD-GAN: COMMONSENSE-DRIVEN GENERATIVE ADVERSARIAL NETWORK WITH HIERARCHICAL REFINEMENT FOR TEXT-TO-IMAGE SYNTHESIS

Synthesizing vivid images from descriptive text is emerging as a frontier cross-domain generation task. However, a single sentence is not enough to generate a high-quality image accurately, because of the information asymmetry between the two modalities; external knowledge is needed to balance the process. Moreover, the limited description of the entities in the sentence cannot guarantee semantic consistency between the text and the generated image, which leaves foreground and background details deficient. Here, we propose a commonsense-driven generative adversarial network (CD-GAN) that generates photo-realistic images using entity-related commonsense knowledge.

The commonsense-driven generative adversarial network contains two key commonsense-based modules: (a) entity semantic augment, which enriches entity semantics with commonsense knowledge to reduce the information asymmetry, and (b) adaptive entity refinement, which generates the high-resolution image over multiple stages, guided by different kinds of commonsense knowledge, to keep the text and image consistent. We demonstrate extensive synthesis cases on the widely used CUB-birds (Caltech-UCSD Birds-200-2011) dataset, where our model achieves competitive results compared with other state-of-the-art models.
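
To make the idea behind module (a) more concrete, here is a minimal sketch of one way an entity embedding could be enriched with commonsense embeddings. It is not the authors' implementation: the abstract does not say how the fusion is done, so the dot-product attention, the residual addition, and the names `augment_entity`, `entity_vec`, and `knowledge_vecs` are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def augment_entity(entity_vec, knowledge_vecs):
    """Hypothetical fusion step (an assumption, not the paper's method).

    Scores each commonsense vector against the entity vector, turns the
    scores into attention weights, and adds the weighted sum of the
    commonsense vectors back onto the entity vector.
    """
    if not knowledge_vecs:
        # No commonsense facts found: leave the entity unchanged.
        return list(entity_vec)
    weights = softmax([dot(entity_vec, k) for k in knowledge_vecs])
    context = [
        sum(w * k[i] for w, k in zip(weights, knowledge_vecs))
        for i in range(len(entity_vec))
    ]
    return [e + c for e, c in zip(entity_vec, context)]


# Toy 2-d vectors standing in for learned embeddings (made up for this example).
bird = [1.0, 0.0]
facts = [[0.0, 1.0], [0.0, 1.0]]  # e.g. two facts retrieved for "bird"
augmented_bird = augment_entity(bird, facts)
```

In a full model the augmented vectors would condition the generator, and a similar lookup could feed each refinement stage of module (b); how the paper actually wires this up is not stated in the abstract.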
