Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks

Takuhiro Kaneko

Understanding the 3D world from 2D projected natural images is a fundamental challenge in computer vision and graphics. Recently, unsupervised learning approaches have garnered considerable attention owing to their advantages in data collection. However, to mitigate training limitations, typical methods need to impose assumptions on the viewpoint distribution (e.g., a dataset containing images from various viewpoints) or on the object shape (e.g., symmetric objects). These assumptions often restrict applications; for instance, application to non-rigid objects or to images captured from similar viewpoints (e.g., flower or bird images) remains a challenge. To complement these approaches, we propose aperture rendering generative adversarial networks (AR-GANs), which add aperture rendering on top of GANs and adopt focus cues to learn the depth and depth-of-field (DoF) effect of unlabeled natural images. To address the ambiguities caused by the unsupervised setting (i.e., ambiguities between smooth texture and out-of-focus blur, and between foreground and background blur), we develop DoF mixture learning, which enables the generator to learn the real image distribution while generating images with diverse DoF. In addition, we devise a center focus prior to guide the learning direction. In the experiments, we demonstrate the effectiveness of AR-GANs on various datasets, such as flower, bird, and face images, demonstrate their portability by incorporating them into other 3D representation learning GANs, and validate their applicability to shallow DoF rendering.
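To make the idea of a depth-dependent DoF effect concrete, the following is a minimal sketch of shallow DoF rendering from an image and a depth map. It is an illustration only and is not the aperture renderer used in AR-GANs: it assumes a grayscale image, a blur radius proportional to the distance from a chosen focus depth, and a simple layered box blur. The names `box_blur`, `render_shallow_dof`, `focus_depth`, `aperture`, and `max_radius` are hypothetical and do not come from the paper.

```python
import numpy as np


def box_blur(img, radius):
    """Average each pixel over a (2r+1) x (2r+1) window, using edge padding."""
    if radius == 0:
        return img.copy()
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    size = 2 * radius + 1
    # Sum all shifted copies of the padded image, then normalize.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / size**2


def render_shallow_dof(img, depth, focus_depth, aperture, max_radius=4):
    """Blur each pixel by an amount that grows with |depth - focus_depth|.

    Assumption for this sketch: blur radius = aperture * |depth - focus_depth|,
    rounded to an integer and clipped to max_radius. Pixels at the focus depth
    stay sharp; aperture = 0 reproduces the input (deep DoF).
    """
    img = img.astype(np.float64)
    radius = np.clip(
        np.rint(aperture * np.abs(depth - focus_depth)), 0, max_radius
    ).astype(int)
    out = np.zeros_like(img)
    # Layered approximation: blur the whole image once per radius level and
    # copy the result only where that radius applies.
    for r in range(max_radius + 1):
        mask = radius == r
        if mask.any():
            out[mask] = box_blur(img, r)[mask]
    return out


if __name__ == "__main__":
    # Checkerboard texture; left half is at the focus depth, right half is farther.
    yy, xx = np.indices((16, 16))
    image = ((yy + xx) % 2).astype(np.float64)
    depth_map = np.where(xx < 8, 0.0, 1.0)
    shallow = render_shallow_dof(image, depth_map, focus_depth=0.0, aperture=2.0)
    print("in-focus half unchanged:", np.array_equal(shallow[:, :8], image[:, :8]))
```

The sketch also hints at the ambiguity the abstract mentions: a region that looks smooth in a single image could be either smooth texture or out-of-focus blur, so depth cannot be read off from blur alone without additional constraints.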

updated: Thu Jun 24 2021 14:15:50 GMT+0000 (UTC)

published: Thu Jun 24 2021 14:15:50 GMT+0000 (UTC)

arXiv
