Self-Supervised Learning of Depth and Ego-Motion from Video by Alternative Training and Geometric Constraints from 3D to 2D

Jiaojiao Fang; Guizhong Liu

Self-supervised learning of depth and ego-motion from unlabeled monocular video has achieved promising results and drawn extensive attention. Most existing methods jointly train the depth and pose networks through the photometric consistency of adjacent frames, following the principle of structure-from-motion (SfM). However, the coupling between the depth and pose networks seriously affects learning performance, and the reprojection relation is sensitive to scale ambiguity, especially for pose learning. In this paper, we aim to improve depth-pose learning performance without auxiliary tasks, and we address the above issues by training each task alternately and by incorporating epipolar geometric constraints into the Iterative Closest Point (ICP) based point cloud matching process. Unlike methods that train the depth and pose networks jointly, our key idea is to better exploit the mutual dependency of the two tasks by alternately training each network with its own loss while keeping the other fixed. We also design a log-scale 3D structural consistency loss that puts more emphasis on smaller depth values during training. To make the optimization easier, we further incorporate epipolar geometry into the ICP-based learning process for pose learning. Extensive experiments on various benchmark datasets indicate the superiority of our algorithm over state-of-the-art self-supervised methods.
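The two geometric ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic scene, and in particular the exact form of `log_depth_loss` (a mean absolute difference of log depths) are assumptions made for illustration. The epipolar residual uses the standard essential-matrix relation x2ᵀ [t]ₓ R x1 = 0 for a relative pose X2 = R X1 + t.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def epipolar_residual(x1, x2, R, t):
    """Algebraic epipolar error x2^T E x1 with E = [t]_x R.

    x1, x2: (N, 3) normalized homogeneous image points in frames 1 and 2.
    R, t:   relative pose mapping frame-1 points to frame 2 (X2 = R @ X1 + t).
    """
    E = skew(t) @ R
    return np.einsum("ni,ij,nj->n", x2, E, x1)

def log_depth_loss(d_a, d_b, eps=1e-6):
    """Hypothetical log-scale discrepancy between two positive depth arrays."""
    return np.mean(np.abs(np.log(d_a + eps) - np.log(d_b + eps)))

# Synthetic scene (assumed values): 50 points in front of the first camera.
rng = np.random.default_rng(0)
X1 = np.column_stack([rng.uniform(-2.0, 2.0, 50),
                      rng.uniform(-2.0, 2.0, 50),
                      rng.uniform(4.0, 10.0, 50)])

# Small rotation about the y-axis plus a translation.
angle = 0.05
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s],
              [0.0, 1.0, 0.0],
              [-s, 0.0, c]])
t = np.array([0.5, 0.0, 0.1])

X2 = X1 @ R.T + t            # the same points expressed in frame 2
x1 = X1 / X1[:, 2:3]         # normalized image coordinates, frame 1
x2 = X2 / X2[:, 2:3]         # normalized image coordinates, frame 2

res_true = epipolar_residual(x1, x2, R, t)
res_scaled = epipolar_residual(x1, x2, R, 2.0 * t)   # translation rescaled
res_wrong = epipolar_residual(x1, x2, np.eye(3), t)  # rotation ignored

# The same 0.1 m depth error, once at 1 m and once at 50 m.
near = log_depth_loss(np.array([1.0]), np.array([1.1]))
far = log_depth_loss(np.array([50.0]), np.array([50.1]))
```

In this sketch `res_true` vanishes up to floating-point error, and so does `res_scaled`: the epipolar relation constrains only the direction of the translation, not its magnitude. `near` is larger than `far`, which is the sense in which a log-scale term puts more weight on errors at small depth values.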

updated: Wed Aug 04 2021 11:40:53 GMT+0000 (UTC)

published: Wed Aug 04 2021 11:40:53 GMT+0000 (UTC)

arXiv
