Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision

Zisong Chen; Chunyu Lin; Lang Nie; Kang Liao; Yao Zhao

Omnidirectional multi-view stereo (MVS) vision is attractive for its ultra-wide field of view (FoV), which enables machines to perceive their 360° 3D surroundings. However, existing solutions require expensive dense depth labels for supervision, making them impractical in real-world applications. In this paper, we propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images. To this end, we project all images to a virtual view center and composite two panoramic images with spherical geometry from two pairs of back-to-back fisheye images. The two 360° images form a stereo pair with a special pose, and photometric consistency between them is leveraged to establish the unsupervised constraint, which we term "Pseudo-Stereo Supervision". In addition, we propose Un-OmniMVS, an efficient unsupervised omnidirectional MVS network that improves inference speed through two efficient components. First, a novel feature extractor with frequency attention simultaneously captures non-local Fourier features and local spatial features, explicitly strengthening the feature representation. Second, a variance-based light cost volume reduces the computational complexity. Experiments show that the performance of our unsupervised solution is competitive with that of state-of-the-art (SoTA) supervised methods, with better generalization on real-world data.
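
The abstract does not spell out how the variance-based cost volume is built. As a rough illustration only, the sketch below shows the generic variance-across-views idea used in MVS pipelines: features from several views, already warped to a set of depth hypotheses, are reduced to one volume by taking their per-element variance, so the result does not grow with the number of views. The function name `variance_cost_volume`, the `(V, C, D, H, W)` array layout, and the use of NumPy are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def variance_cost_volume(warped_feats: np.ndarray) -> np.ndarray:
    """Collapse per-view feature volumes into one variance-based cost volume.

    warped_feats: array of shape (V, C, D, H, W), i.e. features from V views
    that have already been warped to D depth hypotheses (hypothetical layout).
    Returns an array of shape (C, D, H, W). Low variance means the views
    agree at that depth hypothesis.
    """
    if warped_feats.ndim != 5:
        raise ValueError("expected an array of shape (V, C, D, H, W)")
    # Variance over the view axis; the output size is independent of V.
    return warped_feats.var(axis=0)


# Usage with random placeholder features: 4 views, 8 channels, 16 depths.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 16, 5, 10))
cost = variance_cost_volume(feats)
print(cost.shape)
```

Compared with concatenating all per-view features, a variance reduction keeps the channel count fixed at C, which is one common way to keep the later cost-volume processing cheap. Whether Un-OmniMVS uses exactly this reduction is not stated in the abstract.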

updated: Wed Feb 22 2023 08:51:08 GMT+0000 (UTC)

published: Mon Feb 20 2023 11:35:55 GMT+0000 (UTC)

arXiv
