Spectral unmixing is a significant challenge in hyperspectral image processing. Existing unmixing methods encode prior knowledge about the abundance distribution as regularization terms in an optimization problem; the difficulty lies in choosing an appropriate prior and in solving the resulting complex regularized optimization problem. To address both difficulties, we propose a hyperspectral conditional generative adversarial network (HyperGAN) as a generic unmixing framework. It rests on the following assumption: unmixing from pixel to abundance can be regarded as a transformation between two modalities that share a specific internal relationship. HyperGAN consists of a generator and a discriminator. The generator performs the modal conversion from a mixed hyperspectral pixel patch to the abundances of the corresponding endmembers for the patch's central pixel, and the discriminator judges whether the distribution and structure of the generated abundances match those of the true abundances. As the main component of the generator, we propose a hyperspectral image (HSI) Patch Transformer, which uses adaptive attention scores to capture the correlations among pixels within an HSI patch and exploits spatial-spectral information in a fine-grained way to improve the unmixing process. In experiments on synthetic and real hyperspectral data, HyperGAN compares favorably with state-of-the-art methods.
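To make the patch-to-abundance mapping concrete, the following is a minimal NumPy sketch of attention over the pixels of one HSI patch, with the central pixel acting as the query. It is an illustration under stated assumptions, not the HSI Patch Transformer itself: the single attention head, the random projection weights `w_q`, `w_k`, `w_v`, `w_out`, the patch size, and the final softmax that enforces non-negative, sum-to-one abundances are all hypothetical choices made here for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def central_pixel_abundance(patch, w_q, w_k, w_v, w_out):
    """Map an (H, W, bands) patch to abundances for its central pixel.

    Hypothetical single-head attention: the central pixel is the query,
    every pixel in the patch (including the centre) is a key and a value.
    """
    h, w, bands = patch.shape
    tokens = patch.reshape(h * w, bands)      # one token per pixel
    center = (h // 2) * w + (w // 2)          # flat index of the central pixel

    q = tokens[center] @ w_q                  # (d,)
    k = tokens @ w_k                          # (N, d)
    v = tokens @ w_v                          # (N, d)

    # Attention scores: how strongly each pixel informs the central pixel.
    scores = softmax(k @ q / np.sqrt(q.shape[0]))   # (N,)
    context = scores @ v                            # (d,)

    # Assumption: softmax keeps abundances non-negative and summing to one.
    abundance = softmax(context @ w_out)            # (num_endmembers,)
    return abundance, scores


# Toy inputs: a 5x5 patch with 32 bands, 3 endmembers, random (untrained) weights.
rng = np.random.default_rng(0)
P, B, D, E = 5, 32, 16, 3
patch = rng.random((P, P, B))
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(B, D)) for _ in range(3))
w_out = rng.normal(scale=0.1, size=(D, E))

abundance, scores = central_pixel_abundance(patch, w_q, w_k, w_v, w_out)
print(abundance.shape, scores.shape)
```

In an adversarial setup of the kind described above, a vector such as `abundance` would be the generator output that a discriminator compares against the true abundances; the weights here are untrained, so the values themselves carry no meaning.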