A Variational U-Net for Conditional Appearance and Shape Generation
P. Esser, E. Sutter, and B. Ommer (2018). arXiv:1804.04694. CVPR 2018 (Spotlight). Project page: https://compvis.github.io/vunet/.
Abstract
Deep generative models have demonstrated great performance in image
synthesis. However, results deteriorate in the case of spatial deformations, since
such models generate images of objects directly rather than modeling the intricate
interplay of their inherent shape and appearance. We present a conditional
U-Net for shape-guided image generation, conditioned on the output of a
variational autoencoder for appearance. The approach is trained end-to-end on
images, without requiring samples of the same object with varying pose or
appearance. Experiments show that the model enables conditional image
generation and transfer: either shape or appearance can be retained
from a query image while the other is freely altered. Moreover, thanks to its
stochastic latent representation, appearance can be sampled while shape is preserved.
In quantitative and qualitative experiments on COCO, DeepFashion, shoes,
Market-1501 and handbags, the approach demonstrates significant improvements
over the state of the art.
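The abstract describes a two-branch design: a VAE encoder captures appearance as a stochastic latent z, a shape representation y (the paper uses estimates such as edge maps or pose keypoints) is extracted from the image, and a conditional U-Net generates the output from both. The data flow can be sketched as a toy NumPy forward pass; all functions, weights, and shapes here are illustrative stand-ins, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def appearance_encoder(x, latent_dim=8):
    """VAE-style encoder: maps an image to mean/log-variance of an
    appearance latent z (toy linear layer with random weights)."""
    flat = x.reshape(-1)
    W_mu = rng.standard_normal((latent_dim, flat.size)) * 0.01
    W_lv = rng.standard_normal((latent_dim, flat.size)) * 0.01
    mu, log_var = W_mu @ flat, W_lv @ flat
    # reparameterization trick: z = mu + sigma * eps, so z can be sampled
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(latent_dim)
    return z, mu, log_var

def shape_estimate(x):
    """Stand-in for the shape representation y; the paper uses edge maps
    or pose keypoints, here a crude gradient-magnitude map."""
    gy, gx = np.gradient(x.mean(axis=-1))
    return np.hypot(gx, gy)

def conditional_unet(y, z):
    """Toy 'conditional U-Net': fuses the appearance latent with the
    spatial shape map -- stands in for skip-connected conv stages."""
    h, w = y.shape
    z_map = np.tile(z.mean(), (h, w, 3))   # broadcast appearance over space
    return np.tanh(y[..., None] + z_map)   # fused RGB output in [-1, 1]

x = rng.random((16, 16, 3))          # query image
z, mu, log_var = appearance_encoder(x)
y = shape_estimate(x)                # retain shape ...
out = conditional_unet(y, z)         # ... while z controls appearance
print(out.shape)                     # (16, 16, 3)
```

Swapping in a z sampled from the prior while keeping y fixed mirrors the "sample appearance, preserve shape" use case; swapping y from another query while keeping z mirrors transfer.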
@misc{esser2018variational,
abstract = {Deep generative models have demonstrated great performance in image
synthesis. However, results deteriorate in case of spatial deformations, since
they generate images of objects directly, rather than modeling the intricate
interplay of their inherent shape and appearance. We present a conditional
U-Net for shape-guided image generation, conditioned on the output of a
variational autoencoder for appearance. The approach is trained end-to-end on
images, without requiring samples of the same object with varying pose or
appearance. Experiments show that the model enables conditional image
generation and transfer. Therefore, either shape or appearance can be retained
from a query image, while freely altering the other. Moreover, appearance can
be sampled due to its stochastic latent representation, while preserving shape.
In quantitative and qualitative experiments on COCO, DeepFashion, shoes,
Market-1501 and handbags, the approach demonstrates significant improvements
over the state-of-the-art.},
author = {Esser, Patrick and Sutter, Ekaterina and Ommer, Björn},
keywords = {Learning Variational},
  note = {arXiv:1804.04694. CVPR 2018 (Spotlight). Project page at https://compvis.github.io/vunet/},
title = {A Variational U-Net for Conditional Appearance and Shape Generation},
url = {http://arxiv.org/abs/1804.04694},
year = 2018
}