One-shot Ultra-high-Resolution Generative Adversarial Network That Synthesizes 16K Images On A Single GPU

Junseok Oh; Donghwee Yoon; Injung Kim

We propose a one-shot ultra-high-resolution generative adversarial network (OUR-GAN) framework that generates non-repetitive 16K (16,384 x 8,640) images from a single training image and is trainable on a single consumer GPU. OUR-GAN first generates a low-resolution initial image that is visually plausible and varied in shape, and then gradually increases the resolution by adding detail through super-resolution. Because OUR-GAN learns from a real ultra-high-resolution (UHR) image, it can synthesize large shapes with fine details and long-range coherence, which is difficult to achieve with conventional generative models that rely on the patch distribution learned from relatively small images. Because it synthesizes a UHR image part by part through seamless subregion-wise super-resolution, OUR-GAN can synthesize high-quality 16K images with 12.5 GB of GPU memory and 4K images with only 4.29 GB. Additionally, OUR-GAN improves visual coherence while maintaining diversity by applying vertical positional convolution. In experiments on the ST4K and RAISE datasets, OUR-GAN exhibited improved fidelity, visual coherence, and diversity compared with the baseline one-shot synthesis models. To the best of our knowledge, OUR-GAN is the first one-shot image synthesizer that generates non-repetitive UHR images on a single consumer GPU. The synthesized image samples are presented at https://our-gan.github.io.
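The abstract does not spell out how subregion-wise super-resolution is kept seamless, so the following is only a minimal sketch of one common approach, assuming each tile is processed together with a small border (halo) of neighboring pixels. The names `toy_super_resolve` and `tiled_super_resolve` are hypothetical, and the 3x3 box filter plus nearest-neighbor upsampling is a stand-in for a learned super-resolution network, not the authors' model.

```python
import numpy as np

HALO = 1  # border width needed by the 3x3 filter in the toy model (assumption)

def toy_super_resolve(x, scale=2):
    """Stand-in for an SR network: 3x3 box filter without padding, then
    nearest-neighbor upsampling. The input must include a 1-pixel halo."""
    h, w = x.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += x[dy:dy + h - 2, dx:dx + w - 2]
    out /= 9.0
    return np.repeat(np.repeat(out, scale, axis=0), scale, axis=1)

def tiled_super_resolve(image, tile=64, scale=2):
    """Upscale `image` tile by tile so that only one tile (plus its halo)
    has to pass through the model at a time."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    # Pad once so that every tile, including border tiles, has a full halo.
    padded = np.pad(image, HALO, mode="reflect")
    result = np.empty((h * scale, w * scale), dtype=np.float64)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            th = min(tile, h - y)
            tw = min(tile, w - x)
            # Tile plus halo, expressed in padded coordinates.
            patch = padded[y:y + th + 2 * HALO, x:x + tw + 2 * HALO]
            result[y * scale:(y + th) * scale,
                   x * scale:(x + tw) * scale] = toy_super_resolve(patch, scale)
    return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((100, 150))
    full = toy_super_resolve(np.pad(img, HALO, mode="reflect"))
    tiled = tiled_super_resolve(img, tile=32)
    # True when the tiled result matches the single-pass result (no seams).
    print(np.allclose(full, tiled))
```

In this sketch the working set of the model depends on the tile size rather than on the full image size, which is the general reason part-by-part synthesis can fit a very large output into limited GPU memory. The memory figures quoted in the abstract come from the authors' system and are not reproduced by this toy example.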

updated: Mon Aug 28 2023 04:52:53 GMT+0000 (UTC)

published: Mon Feb 28 2022 13:48:41 GMT+0000 (UTC)

arXiv

