We study the problem of multimodal generative modelling of images based on generative adversarial networks (GANs). Despite the success of existing methods, they often ignore the underlying structure of vision data or its multimodal generation characteristics. To address this problem, we introduce the Dirichlet prior for multimodal image generation, which leads to a new Latent Dirichlet Allocation based GAN (LDAGAN). In detail, for the generative process modelling, LDAGAN defines a generative mode for each sample, determining which generative sub-process it belongs to. For the adversarial training, LDAGAN derives a variational expectation-maximization (VEM) algorithm to estimate model parameters. Experimental results on real-world datasets have demonstrated the outstanding performance of LDAGAN over other existing GANs.
updated: Wed Nov 06 2019 14:56:54 GMT+0000 (UTC)
published: Mon Dec 17 2018 01:21:20 GMT+0000 (UTC)