AdaBins: Depth Estimation using Adaptive Bins

Shariq Farooq Bhat; Ibraheem Alhashim; Peter Wonka

We address the problem of estimating a high-quality dense depth map from a single RGB input image. We start from a baseline encoder-decoder convolutional neural network architecture and ask how global processing of information can improve overall depth estimation. To this end, we propose a transformer-based architecture block that divides the depth range into bins whose center values are estimated adaptively per image. The final depth values are estimated as linear combinations of the bin centers. We call our new building block AdaBins. Our results show a decisive improvement over the state of the art on several popular depth datasets across all metrics. We also validate the effectiveness of the proposed block with an ablation study and provide the code and corresponding pre-trained weights of the new state-of-the-art model.
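
The idea of predicting depth as a linear combination of adaptive bin centers can be illustrated with a small sketch. This is not the authors' released code. It assumes the network outputs, per image, a vector of positive bin widths and, per pixel, a vector of scores over the bins that is turned into weights with a softmax. The function names (`bin_centers`, `softmax`, `pixel_depth`) and the depth range values are hypothetical and chosen only for illustration.

```python
import math


def bin_centers(widths, d_min, d_max):
    """Map per-image bin widths to bin centers inside [d_min, d_max].

    Assumption: widths are positive; they are normalized here so that
    the bins exactly tile the depth range.
    """
    total = sum(widths)
    centers = []
    left = d_min  # left edge of the current bin
    for w in widths:
        span = (d_max - d_min) * w / total
        centers.append(left + span / 2.0)
        left += span
    return centers


def softmax(scores):
    """Numerically stable softmax over a list of per-bin scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def pixel_depth(scores, centers):
    """Depth of one pixel as a weighted sum of the bin centers."""
    probs = softmax(scores)
    return sum(p * c for p, c in zip(probs, centers))


# Hypothetical example: three bins over a depth range of 0 to 8 units.
centers = bin_centers([1.0, 1.0, 2.0], d_min=0.0, d_max=8.0)
depth = pixel_depth([0.0, 0.0, 0.0], centers)  # equal weight on every bin
print(centers, depth)
```

Because the weights are non-negative and sum to one, the predicted depth always lies between the smallest and largest bin center, and it varies smoothly as the per-pixel scores change rather than jumping between discrete bin values.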

updated: Sat Nov 28 2020 14:40:45 GMT+0000 (UTC)

published: Sat Nov 28 2020 14:40:45 GMT+0000 (UTC)

arXiv
