SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation

Haoran Pan; Jun Zhou; Yuanpeng Liu; Xuequan Lu; Weiming Wang; Xuefeng Yan; Mingqiang Wei

6D pose estimation of rigid objects from RGB-D images is crucial for object grasping and manipulation in robotics. Although the RGB channels and the depth (D) channel are often complementary, providing appearance and geometry information respectively, it remains non-trivial to fully benefit from the two cross-modal data sources. We start from a simple yet new observation: when an object rotates, its semantic label is invariant to the pose, while its keypoint offset direction varies with the pose. Building on this observation, we present SO(3)-Pose, a new representation learning network that explores SO(3)-equivariant and SO(3)-invariant features from the depth channel for pose estimation. The SO(3)-invariant features facilitate learning more distinctive representations for segmenting objects that look similar in the RGB channels. The SO(3)-equivariant features communicate with RGB features to deduce the (missing) geometry needed to detect keypoints of an object whose reflective surface degrades the depth channel. Unlike most existing pose estimation methods, SO(3)-Pose not only implements information communication between the RGB and depth channels, but also naturally absorbs SO(3)-equivariant geometry knowledge from depth images, leading to better appearance and geometry representation learning. Comprehensive experiments show that our method achieves state-of-the-art performance on three benchmarks.
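
The observation behind the method can be checked numerically on a toy example. The sketch below is not the SO(3)-Pose network; it only illustrates the two properties the abstract names, under assumptions of our own: a hypothetical object reduced to one surface point and one keypoint, a rotation about the z-axis as the sample element of SO(3), and the point-to-keypoint distance standing in for a pose-invariant quantity (the abstract's invariant quantity is the semantic label). All function and variable names are illustrative.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix about the z-axis (one sample element of SO(3))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(R, p):
    """Rotate the 3D point or vector p by the matrix R."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]

def offset_direction(point, keypoint):
    """Unit vector pointing from a surface point to a keypoint."""
    d = [k - p for p, k in zip(point, keypoint)]
    n = math.sqrt(sum(x * x for x in d))
    return [x / n for x in d]

# Hypothetical object: a single surface point and a single keypoint.
point = [1.0, 0.0, 0.5]
keypoint = [0.0, 2.0, 1.0]
R = rotation_z(math.pi / 3)

# Rotate the whole object (surface point and keypoint together).
point_r = apply(R, point)
keypoint_r = apply(R, keypoint)

# Equivariance: the offset direction of the rotated object equals
# the rotated offset direction of the original object.
dir_after = offset_direction(point_r, keypoint_r)
dir_rotated = apply(R, offset_direction(point, keypoint))
equivariant_ok = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(dir_after, dir_rotated)
)

# Invariance: the point-to-keypoint distance does not change under rotation.
invariant_ok = math.isclose(
    math.dist(point, keypoint), math.dist(point_r, keypoint_r), abs_tol=1e-9
)

print("offset direction is equivariant:", equivariant_ok)
print("distance is invariant:", invariant_ok)
```

A feature extractor that is SO(3)-equivariant would satisfy the first relation for its vector outputs, and one that is SO(3)-invariant would satisfy the second for its scalar outputs; how SO(3)-Pose builds such features is not described in this abstract.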

updated: Wed Aug 17 2022 15:04:47 GMT+0000 (UTC)

published: Wed Aug 17 2022 15:04:47 GMT+0000 (UTC)

arXiv
