Single Image Depth Prediction with Wavelet Decomposition

Michaël Ramamonjisoa; Michael Firman; Jamie Watson; Vincent Lepetit; Daniyar Turmukhambetov

We present a novel method for predicting accurate depths from monocular images with high efficiency. This efficiency is achieved by exploiting wavelet decomposition, which is integrated into a fully differentiable encoder-decoder architecture. We demonstrate that we can reconstruct high-fidelity depth maps by predicting sparse wavelet coefficients. In contrast to previous work, we show that wavelet coefficients can be learned without direct supervision on the coefficients. Instead, we supervise only the final depth image that is reconstructed through the inverse wavelet transform. We additionally show that wavelet coefficients can be learned in fully self-supervised scenarios, without access to ground-truth depth. Finally, we apply our method to different state-of-the-art monocular depth estimation models, in each case giving similar or better results compared to the original model, while requiring less than half the multiply-adds in the decoder network. Code is available at https://github.com/nianticlabs/wavelet-monodepth
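To make the reconstruction step concrete, the following minimal sketch shows how a depth map can be rebuilt from a coarse approximation plus sparse detail coefficients through an inverse wavelet transform. It is an illustration only, not the authors' implementation (see the repository above for that). It assumes a single-level orthonormal Haar wavelet, and the function names `haar_dwt2` and `haar_idwt2` and the toy depth values are hypothetical.

```python
def haar_dwt2(img):
    """One level of a 2D orthonormal Haar transform.

    `img` is a list of rows with even height and width. Returns four
    half-resolution bands: the coarse approximation LL and the three
    detail bands LH, HL, HH. (Illustrative helper, not from the paper.)
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            # The four pixels of one 2x2 block.
            a, b = img[r][c], img[r][c + 1]
            e, d = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + e + d) / 2)  # local average (scaled)
            row_lh.append((a - b + e - d) / 2)  # left-right difference
            row_hl.append((a + b - e - d) / 2)  # top-bottom difference
            row_hh.append((a - b - e + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of `haar_dwt2`: rebuilds the full-resolution image."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, x, y, z = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        out.append(top)
        out.append(bottom)
    return out


# Toy piecewise-constant "depth map" with one vertical depth edge.
depth = [[2.0, 2.0, 2.0, 6.0] for _ in range(4)]

ll, lh, hl, hh = haar_dwt2(depth)
restored = haar_idwt2(ll, lh, hl, hh)

# Count the detail coefficients that are non-zero.
details = [v for band in (lh, hl, hh) for row in band for v in row]
nonzero_details = sum(1 for v in details if v != 0)
print(restored == depth, nonzero_details, len(details))
```

In this toy case only 2 of the 12 detail coefficients are non-zero, and both sit on the depth edge. Piecewise-smooth depth maps tend to have this kind of sparsity, which is why a network that predicts detail coefficients only where they matter can save computation. Because the inverse transform is a fixed linear operation, a loss on the reconstructed depth can in principle propagate gradients back to the predicted coefficients, which is consistent with supervising only the final depth image.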

updated: Mon Aug 16 2021 12:11:38 GMT+0000 (UTC)

published: Thu Jun 03 2021 17:42:25 GMT+0000 (UTC)

arXiv
