Deep Projective Rotation Estimation through Relative Supervision

Brian Okorn; Chuer Pan; Martial Hebert; David Held

Orientation estimation is at the core of a variety of vision and robotics tasks, such as camera and object pose estimation. Deep learning has offered a way to develop image-based orientation estimators; however, such estimators often require training on a large labeled dataset, which can be time-intensive to collect. In this work, we explore whether self-supervised learning from unlabeled data can be used to alleviate this issue. Specifically, we assume access to estimates of the relative orientation between neighboring poses, such as those that can be obtained via a local alignment method. While self-supervised learning has been used successfully for translational object keypoints, we show that naively applying relative supervision to the rotation group SO(3) will often fail to converge due to the non-convexity of the rotational space. To tackle this challenge, we propose a new algorithm for self-supervised orientation estimation that uses Modified Rodrigues Parameters to stereographically project the closed manifold of SO(3) to the open manifold of R^3, allowing the optimization to be done in an open Euclidean space. We empirically validate the benefits of the proposed algorithm for the rotation averaging problem in two settings: (1) direct optimization of rotation parameters, and (2) optimization of the parameters of a convolutional neural network that predicts object orientations from images. In both settings, we demonstrate that our proposed algorithm converges to a consistent relative orientation frame much faster than algorithms that operate purely in the SO(3) space. Additional information can be found at https://sites.google.com/view/deep-projective-rotation/home .
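As a rough illustration of the stereographic projection mentioned in the abstract, the sketch below uses the textbook definition of Modified Rodrigues Parameters: a unit quaternion with scalar part w and vector part v maps to v / (1 + w). The function names and the (w, x, y, z) quaternion ordering are assumptions made for this example and are not taken from the paper's code.

```python
import numpy as np

def quat_to_mrp(q):
    """Map a unit quaternion (w, x, y, z) to Modified Rodrigues Parameters.

    Hypothetical helper for illustration; uses the standard MRP definition.
    Singular at w = -1, i.e. a full 360-degree rotation.
    """
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)      # guard against small normalization drift
    w, v = q[0], q[1:]
    # Stereographic projection of the unit 3-sphere onto R^3
    return v / (1.0 + w)

def mrp_to_quat(psi):
    """Inverse map: any vector in R^3 yields a valid unit quaternion."""
    psi = np.asarray(psi, dtype=float)
    n2 = float(psi @ psi)          # squared norm of the MRP vector
    w = (1.0 - n2) / (1.0 + n2)
    v = 2.0 * psi / (1.0 + n2)
    return np.concatenate(([w], v))

# Usage: a 90-degree rotation about the z-axis
q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
psi = quat_to_mrp(q)               # magnitude equals tan(angle / 4)
q_back = mrp_to_quat(psi)          # recovers q up to floating-point error
```

Because every point of R^3 maps back to a unit quaternion, an optimizer can update the three parameters freely without a projection or normalization step, which is one way to read the abstract's "open Euclidean space".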

updated: Mon Nov 21 2022 04:58:07 GMT+0000 (UTC)

published: Mon Nov 21 2022 04:58:07 GMT+0000 (UTC)

arXiv
