Efficient High-Resolution Image-to-Image Translation using Multi-Scale Gradient U-Net

Kumarapu Laxman; Shiv Ram Dubey; Baddam Kalyan; Satya Raj Vineel Kojjarapu

Recently, Conditional Generative Adversarial Networks (Conditional GANs) have shown very promising performance in several image-to-image translation applications. However, the use of these conditional GANs has been largely limited to low-resolution images, such as 256×256. Pix2Pix-HD is a recent attempt to utilize the conditional GAN for high-resolution image synthesis. In this paper, we propose a Multi-Scale Gradient based U-Net (MSG U-Net) model for high-resolution image-to-image translation up to 2048×1024 resolution. The proposed model is trained by allowing gradients to flow from multiple discriminators to a single generator at multiple scales. The proposed MSG U-Net architecture leads to photo-realistic high-resolution image-to-image translation. Moreover, the proposed model is computationally efficient compared to Pix2Pix-HD, improving the inference time by nearly 2.5 times. We provide the code of the MSG U-Net model at https://github.com/laxmaniron/MSG-U-Net.
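The training idea in the abstract, a single generator receiving gradients from several discriminators at several scales, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the number of scales, the halving between scales, the equal per-scale weights, and the non-saturating loss form are all assumptions chosen for illustration.

```python
import math


def scale_pyramid(width, height, num_scales):
    """List (width, height) per scale, largest first.

    Assumption: each coarser scale halves both dimensions.
    """
    return [(width >> s, height >> s) for s in range(num_scales)]


def generator_adv_loss(disc_probs):
    """Non-saturating generator loss: mean of -log D(G(x)).

    disc_probs: discriminator outputs in (0, 1] for generated images.
    The loss form is an assumption; the abstract does not specify it.
    """
    return -sum(math.log(p) for p in disc_probs) / len(disc_probs)


def multi_scale_generator_loss(probs_per_scale, weights=None):
    """Weighted sum of per-scale adversarial losses.

    Summing the terms means that, in a real autodiff framework, every
    discriminator would contribute gradient to the one shared generator.
    Equal weights are a hypothetical default.
    """
    if weights is None:
        weights = [1.0] * len(probs_per_scale)
    if len(weights) != len(probs_per_scale):
        raise ValueError("need exactly one weight per scale")
    return sum(
        w * generator_adv_loss(probs)
        for w, probs in zip(weights, probs_per_scale)
    )
```

For example, `scale_pyramid(2048, 1024, 4)` yields the resolutions 2048×1024, 1024×512, 512×256, and 256×128, and `multi_scale_generator_loss` would take one list of discriminator outputs per resolution. In this sketch the generator's objective is just the sum of the per-scale terms, so lowering it requires fooling the discriminators at every scale at once.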

updated: Thu May 27 2021 11:32:35 GMT+0000 (UTC)

published: Thu May 27 2021 11:32:35 GMT+0000 (UTC)

arXiv
