Drop the GAN: In Defense of Patches Nearest Neighbors as Single Image Generative Models

Niv Granot; Ben Feinstein; Assaf Shocher; Shai Bagon; Michal Irani

Single image generative models perform synthesis and manipulation tasks by capturing the distribution of patches within a single image. The classical (pre-deep-learning) approaches that prevailed for these tasks are based on an optimization process that maximizes patch similarity between the input and the generated output. Recently, however, single image GANs were introduced both as a superior solution for such manipulation tasks and as a means for remarkable novel generative tasks. Despite their impressive results, single image GANs require long training times (usually hours) for each image and each task. They often suffer from artifacts and are prone to optimization issues such as mode collapse. In this paper, we show that all of these tasks can be performed without any training, within several seconds, in a unified, surprisingly simple framework. We revisit the "good-old" patch-based methods and cast them into a novel optimization-free framework. We start with an initial coarse guess, and then simply refine the details coarse-to-fine using patch-nearest-neighbor search. This allows generating random novel images better and much faster than GANs. We further demonstrate a wide range of applications, such as image editing and reshuffling, retargeting to different sizes, structural analogies, image collage, and a newly introduced task of conditional inpainting. Not only is our method faster than a GAN (by a factor of 10^3 to 10^4), it also produces superior results (confirmed by quantitative and qualitative evaluation), with fewer artifacts and more realistic global structure than any of the previous approaches (whether GAN-based or classical patch-based).
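
The coarse-to-fine refinement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a 2-D grayscale image, a plain squared-L2 patch distance, a factor-2 box-filter pyramid, Gaussian noise added to the coarsest level as the "initial coarse guess", and simple averaging of overlapping patches; the abstract specifies none of these details. All function names and parameters (`pnn_step`, `generate`, `levels`, `noise_std`, and so on) are hypothetical.

```python
import numpy as np

def extract_patches(img, p):
    """Return all p x p patches of a 2-D image as rows of a matrix."""
    h, w = img.shape
    return np.stack([
        img[i:i + p, j:j + p].ravel()
        for i in range(h - p + 1)
        for j in range(w - p + 1)
    ])

def combine_patches(patches, shape, p):
    """Average overlapping patches back into an image of the given shape."""
    h, w = shape
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    k = 0
    for i in range(h - p + 1):
        for j in range(w - p + 1):
            acc[i:i + p, j:j + p] += patches[k].reshape(p, p)
            cnt[i:i + p, j:j + p] += 1
            k += 1
    return acc / cnt

def pnn_step(guess, source, p=3):
    """Replace every patch of `guess` by its nearest patch in `source`."""
    q = extract_patches(guess, p)
    s = extract_patches(source, p)
    # Squared L2 distance between every query patch and every source patch.
    d = (q ** 2).sum(1)[:, None] - 2.0 * q @ s.T + (s ** 2).sum(1)[None, :]
    nearest = d.argmin(axis=1)
    return combine_patches(s[nearest], guess.shape, p)

def downscale(img, f):
    """Box-filter downscaling by an integer factor f (crops any remainder)."""
    h, w = (img.shape[0] // f) * f, (img.shape[1] // f) * f
    return img[:h, :w].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def upscale(img, f):
    """Nearest-neighbor upscaling by an integer factor f."""
    return np.repeat(np.repeat(img, f, axis=0), f, axis=1)

def generate(source, levels=3, p=3, noise_std=0.5, iters=2, seed=0):
    """Coarse-to-fine generation: noisy coarse guess, then patch-NN refinement."""
    rng = np.random.default_rng(seed)
    # pyramid[0] is the full-resolution source, pyramid[-1] the coarsest level.
    pyramid = [downscale(source, 2 ** k) for k in range(levels)]
    coarse = pyramid[-1]
    # Assumed initial guess: coarsest source level plus Gaussian noise.
    guess = coarse + rng.normal(0.0, noise_std, coarse.shape)
    for level in range(levels - 1, -1, -1):
        if level < levels - 1:
            guess = upscale(guess, 2)  # carry the result to the next finer level
        for _ in range(iters):
            guess = pnn_step(guess, pyramid[level], p)
    return guess
```

Calling `generate(img, seed=k)` with different seeds yields different outputs assembled only from patches of `img`. The sketch builds the full distance matrix, so it is practical only for small images; it also expects the image side lengths to be divisible by `2 ** (levels - 1)` and the coarsest level to be at least `p` pixels wide.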

updated: Tue Aug 24 2021 08:07:56 GMT+0000 (UTC)

published: Mon Mar 29 2021 12:20:46 GMT+0000 (UTC)

arXiv
