Learning multiplane images from single views with self-supervision

Gustavo Sutter P. Carvalho; Diogo C. Luvizon; Antonio Joia Neto; Andre G. C. Pacheco; Otavio A. B. Penatti

Generating static novel views from an already captured image is a hard task in computer vision and graphics, particularly when the single input image contains dynamic parts such as people or moving objects. In this paper, we tackle this problem by proposing a new framework, called CycleMPI, that learns a multiplane image representation from single images through a cyclic training strategy for self-supervision. Because our framework does not require stereo data for training, it can be trained with massive visual data from the Internet, resulting in better generalization even in very challenging cases. Although our method does not require stereo data for supervision, it reaches results on stereo datasets comparable to the state of the art in a zero-shot scenario. We evaluate our method on the RealEstate10K and Mannequin Challenge datasets for view synthesis and present qualitative results on the Places II dataset.
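For readers unfamiliar with multiplane images (MPIs), the sketch below may help. An MPI is a stack of fronto-parallel RGBA planes at fixed depths, and a view is typically rendered by blending the planes from back to front with the standard "over" operator. The abstract does not describe CycleMPI's renderer, so this is only an illustration of that generic compositing step for a single pixel, under the assumption that plane 0 is the farthest plane. The function name `composite_mpi_pixel` is hypothetical, and the per-plane warping needed to render a new viewpoint is omitted.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def composite_mpi_pixel(colors: Sequence[RGB], alphas: Sequence[float]) -> RGB:
    """Blend one pixel across MPI planes with the back-to-front "over" operator.

    colors[i] and alphas[i] belong to plane i. Index 0 is assumed to be the
    farthest plane and the last index the nearest one. Alphas lie in [0, 1].
    """
    if len(colors) != len(alphas):
        raise ValueError("colors and alphas must have the same length")

    out = [0.0, 0.0, 0.0]  # start from a black background
    for color, alpha in zip(colors, alphas):
        # The nearer plane covers the accumulated result in proportion to alpha.
        out = [alpha * c + (1.0 - alpha) * o for c, o in zip(color, out)]
    return (out[0], out[1], out[2])


# A half-transparent blue plane in front of an opaque red plane.
pixel = composite_mpi_pixel([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [1.0, 0.5])
print(pixel)
```

In a full renderer the same blend would run over every pixel after each plane has been warped into the target camera, which is what lets a single MPI produce several nearby views.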

updated: Tue Oct 19 2021 07:42:28 GMT+0000 (UTC)

published: Mon Oct 18 2021 15:03:08 GMT+0000 (UTC)

arXiv
