EpiGRAF: Rethinking training of 3D GANs

Ivan Skorokhodov; Sergey Tulyakov; Yiqun Wang; Peter Wonka

A very recent trend in generative modeling is building 3D-aware generators from 2D image collections. To induce the 3D bias, such models typically rely on volumetric rendering, which is expensive to employ at high resolutions. Over the past months, more than 10 works have appeared that address this scaling issue by training a separate 2D decoder to upsample a low-resolution image (or a feature tensor) produced by a pure 3D generator. But this solution comes at a cost: not only does it break multi-view consistency (i.e., shape and texture change when the camera moves), but it also learns the geometry at low fidelity. In this work, we show that it is possible to obtain a high-resolution 3D generator with SotA image quality by following a completely different route: simply training the model patch-wise. We revisit and improve this optimization scheme in two ways. First, we design a location- and scale-aware discriminator to work on patches of different proportions and spatial positions. Second, we modify the patch sampling strategy based on an annealed beta distribution to stabilize training and accelerate convergence. The resulting model, named EpiGRAF, is an efficient, high-resolution, pure 3D generator, and we test it on four datasets (two introduced in this work) at 256^2 and 512^2 resolutions. It obtains state-of-the-art image quality and high-fidelity geometry, and trains ≈2.5× faster than its upsampler-based counterparts. Project website: https://universome.github.io/epigraf.
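
The abstract mentions patch sampling based on an annealed beta distribution but gives no details. The following is a minimal Python sketch of what such a sampler could look like, not the authors' implementation. The function names, the Beta(1, beta) parameterization, the linear annealing schedule, and the values of `beta_start`, `beta_end`, and `min_scale` are all illustrative assumptions. Under these assumptions, early training draws patches that cover almost the whole image, and later training mixes in progressively smaller patches.

```python
import random


def annealed_beta(step, total_steps, beta_start=0.001, beta_end=0.8):
    """Linearly interpolate the Beta shape parameter over training.

    The linear schedule and both endpoint values are assumptions made
    for illustration; the abstract does not specify them.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return beta_start + frac * (beta_end - beta_start)


def sample_patch(step, total_steps, min_scale=0.125, rng=random):
    """Sample a square patch as (scale, x, y) in normalized [0, 1] coordinates.

    scale is the patch side length relative to the full image, and
    (x, y) is the top-left corner. min_scale is a hypothetical lower bound.
    """
    beta = annealed_beta(step, total_steps)
    # Beta(1, beta) with a small beta concentrates its mass near 1,
    # so early in training most patches span nearly the full image.
    u = rng.betavariate(1.0, beta)
    scale = min_scale + u * (1.0 - min_scale)
    # Place the patch uniformly at random so that it stays inside the image.
    x = rng.uniform(0.0, 1.0 - scale)
    y = rng.uniform(0.0, 1.0 - scale)
    return scale, x, y
```

A training loop would call `sample_patch(step, total_steps)` once per image and render only the pixels inside the returned patch. The same `(scale, x, y)` triple could also be passed to the discriminator as conditioning, which is one way to make it location- and scale-aware.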

updated: Tue Jun 21 2022 17:08:23 GMT+0000 (UTC)

published: Tue Jun 21 2022 17:08:23 GMT+0000 (UTC)

arXiv
