360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume

Ning-Hsu Wang; Bolivar Solarte; Yi-Hsuan Tsai; Wei-Chen Chiu; Min Sun

Recently, end-to-end trainable deep neural networks have significantly improved stereo depth estimation for perspective images. However, 360° images captured under equirectangular projection cannot benefit from directly adopting existing methods because of the distortion the projection introduces (i.e., lines in 3D are not projected onto lines in 2D). To tackle this issue, we present a novel architecture specifically designed for spherical disparity using a top-bottom 360° camera pair. Moreover, we propose to mitigate the distortion issue with (1) an additional input branch capturing the position and relation of each pixel in spherical coordinates, and (2) a cost volume built upon a learnable shifting filter. Due to the lack of 360° stereo data, we collect two 360° stereo datasets from Matterport3D and Stanford3D for training and evaluation. Extensive experiments and ablation studies are provided to validate our method against existing algorithms. Finally, we show promising results in real-world environments, using images captured with two consumer-level cameras.
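The spherical disparity mentioned in the abstract can be illustrated with plain triangle geometry. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the angle conventions, and the half-pixel row centres are all hypothetical choices made for illustration. It shows (a) a per-row polar angle for an equirectangular image, the kind of positional information an extra input branch could carry, and (b) how an angular disparity between two vertically stacked cameras maps to distance via the law of sines.

```python
import math


def polar_angle_rows(height):
    """Polar angle (radians) at the centre of each image row.

    Assumed convention: row 0 is the top of the equirectangular image and
    the angle is measured from the straight-up direction, so values run
    from just above 0 to just below pi.
    """
    return [(row + 0.5) * math.pi / height for row in range(height)]


def range_from_disparity(angle_from_baseline, disparity, baseline):
    """Distance from the reference camera to a scene point.

    angle_from_baseline: angle at the reference camera between the
        direction to the other camera and the ray to the point. For a top
        camera with the polar convention above, this is pi minus the
        polar angle.
    disparity: how much larger that angle is when measured at the other
        camera; it equals the triangle's angle at the scene point.
    baseline: distance between the two camera centres.

    Law of sines in the triangle (reference camera, other camera, point):
        r / sin(angle_from_baseline + disparity) = baseline / sin(disparity)
    """
    if disparity <= 0.0:
        return math.inf  # zero disparity: point is treated as infinitely far
    return baseline * math.sin(angle_from_baseline + disparity) / math.sin(disparity)


if __name__ == "__main__":
    # Hypothetical scene: point 2 m sideways and 1 m below the top camera,
    # with the bottom camera 0.5 m below the top one.
    baseline = 0.5
    angle_top = math.atan2(2.0, 1.0)
    angle_bottom = math.atan2(2.0, 1.0 - baseline)
    r = range_from_disparity(angle_top, angle_bottom - angle_top, baseline)
    print(r, math.hypot(2.0, 1.0))  # both values should agree
```

Expanding the sine of the sum gives the equivalent form r = baseline · (sin(angle) / tan(disparity) + cos(angle)), which makes clear that the same angular disparity corresponds to different distances at different polar angles, unlike the single focal-length-times-baseline relation used for perspective stereo.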

updated: Thu Mar 26 2020 15:51:54 GMT+0000 (UTC)

published: Mon Nov 11 2019 18:56:49 GMT+0000 (UTC)

arXiv
