This paper aims to disentangle the latent space of a conditional VAE (cVAE) into a spatial structure code and a style code. The two codes are complementary: one, z_s, is label-relevant, and the other, z_u, is label-irrelevant. The generator consists of a connected encoder-decoder and a label condition mapping network. Depending on whether the label is related to spatial structure, the output z_s of the condition mapping network serves as either a spatial structure code or a style code. The encoder provides the label-irrelevant posterior from which z_u is sampled. The decoder injects z_s and z_u at each layer through adaptive normalization such as SPADE or AdaIN. Extensive experiments on two datasets with different types of labels show the effectiveness of our method.
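The two adaptive normalization schemes named above differ mainly in the shape of their modulation parameters: an AdaIN-style layer applies one scale and shift per channel, which suits a style code, while a SPADE-style layer applies a scale and shift at every spatial position, which suits a spatial structure code. The following is a minimal NumPy sketch of that difference, not the paper's implementation. The function names, the random linear projections that turn a code into modulation parameters, and the use of per-sample instance statistics in the SPADE-style layer are all assumptions made for illustration.

```python
import numpy as np


def instance_normalize(x, eps=1e-5):
    """Normalize each channel of x (shape C x H x W) over its spatial axes."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)


def adain(x, gamma, beta):
    """AdaIN-style modulation: one scale and one shift per channel.

    gamma and beta have shape (C,) and would be predicted from a style code.
    """
    return gamma[:, None, None] * instance_normalize(x) + beta[:, None, None]


def spade(x, gamma_map, beta_map):
    """SPADE-style modulation: a scale and a shift per spatial position.

    gamma_map and beta_map have shape (C, H, W) and would be predicted from
    a spatial structure code. Assumption: instance statistics are used here
    for simplicity instead of batch statistics.
    """
    return gamma_map * instance_normalize(x) + beta_map


rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
x = rng.normal(size=(C, H, W))  # stand-in for a decoder feature map

# Style case: a vector code is projected to per-channel parameters.
# The projection matrices are hypothetical stand-ins for learned layers.
z_style = rng.normal(size=16)
gamma = rng.normal(size=(C, 16)) @ z_style
beta = rng.normal(size=(C, 16)) @ z_style
styled = adain(x, gamma, beta)

# Structure case: a spatial code is projected to per-position parameters.
z_struct = rng.normal(size=(1, H, W))
gamma_map = rng.normal(size=(C, 1, 1)) * z_struct
beta_map = rng.normal(size=(C, 1, 1)) * z_struct
structured = spade(x, gamma_map, beta_map)
```

In this sketch the style path discards all spatial layout from the code, while the structure path preserves it, which is one way to read the abstract's choice of routing z_s through either mechanism depending on the label type.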
updated: Wed Jul 15 2020 09:02:56 GMT+0000 (UTC)
published: Tue Oct 29 2019 03:14:13 GMT+0000 (UTC)