SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments

Hanjiang Hu; Baoquan Yang; Zhijian Qiao; Ding Zhao; Hesheng Wang

Changing environments pose a great challenge to outdoor visual perception and scene understanding for robust long-term autonomous driving and mobile robots, where depth-auxiliary geometric information plays an essential role in robustness under challenging scenes. Although monocular depth prediction has been well studied recently, few works focus on depth prediction across multiple environmental conditions, e.g., changing illumination and seasons, owing to the lack of a real-world dataset and benchmark for this setting. In this work, a new cross-season monocular depth prediction dataset, SeasonDepth (available at https://seasondepth.github.io), is derived from the CMU Visual Localization dataset through structure from motion. To benchmark depth estimation performance under different environments, we investigate representative and recent state-of-the-art open-source supervised, self-supervised, and domain adaptation depth prediction methods from the KITTI benchmark using several newly formulated metrics. Through extensive experimental evaluation on the proposed dataset without fine-tuning, the influence of multiple environments on performance and robustness is analyzed both qualitatively and quantitatively, showing that long-term monocular depth prediction is still far from solved. We further give promising solutions, especially with stereo geometry and multi-task sequential self-supervised training, to enhance robustness to changing environments.

updated: Tue Jun 08 2021 14:35:07 GMT+0000 (UTC)

published: Mon Nov 09 2020 13:24:45 GMT+0000 (UTC)

arXiv
