CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

Yijia Weng; He Wang; Qiang Zhou; Yuzhe Qin; Yueqi Duan; Qingnan Fan; Baoquan Chen; Hao Su; Leonidas J. Guibas

In this work, we tackle the problem of category-level online pose tracking of objects from point cloud sequences. For the first time, we propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances as well as per-part pose tracking for articulated objects from known categories. Here, the 9DoF pose, comprising 6D pose and 3D size, is equivalent to a 3D amodal bounding box representation with free 6D pose. Given the depth point cloud at the current frame and the estimated pose from the last frame, our novel end-to-end pipeline learns to accurately update the pose. Our pipeline is composed of three modules: 1) a pose canonicalization module that normalizes the pose of the input depth point cloud; 2) RotationNet, a module that directly regresses small inter-frame delta rotations; and 3) CoordinateNet, a module that predicts the normalized coordinates and segmentation, enabling analytical computation of the 3D size and translation. Leveraging the small pose regime in the pose-canonicalized point clouds, our method integrates the best of both worlds by combining dense coordinate prediction and direct rotation regression, thus yielding an end-to-end differentiable pipeline optimized for 9DoF pose accuracy (without using non-differentiable RANSAC). Our extensive experiments demonstrate that our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) while running at the fastest speed, about 12 FPS.
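
As an illustration of the two analytical steps mentioned in the abstract (pose canonicalization with the previous frame's pose, and closed-form recovery of scale and translation once a rotation and per-point normalized coordinates are available), the following is a minimal NumPy sketch. It is not the authors' implementation: the function names `canonicalize` and `solve_scale_translation` are hypothetical, a single uniform scale stands in for the 3D size, and synthetic ground-truth values replace the outputs of RotationNet and CoordinateNet.

```python
import numpy as np

def canonicalize(points, R_prev, t_prev, s_prev):
    """Map camera-frame points (N, 3) into the frame of a previous pose estimate.

    Row-vector form of R_prev.T @ (x - t_prev) / s_prev for every point x.
    """
    return (points - t_prev) @ R_prev / s_prev

def solve_scale_translation(points, coords, R):
    """Least-squares s, t with points ~= s * R @ coords + t, for a fixed rotation R.

    Assumption: uniform scalar scale; the closed form follows from centering
    both point sets and minimizing the squared residual.
    """
    rotated = coords @ R.T                 # apply R to each normalized coordinate
    p_mean = points.mean(axis=0)
    r_mean = rotated.mean(axis=0)
    p_c = points - p_mean                  # centered observations
    r_c = rotated - r_mean                 # centered rotated coordinates
    s = np.sum(p_c * r_c) / np.sum(r_c * r_c)
    t = p_mean - s * r_mean
    return s, t

# Synthetic check (hypothetical values, standing in for network predictions).
rng = np.random.default_rng(0)
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_true) < 0:              # make it a proper rotation
    R_true[:, 0] *= -1
s_true = 0.3
t_true = np.array([0.1, -0.2, 0.8])

coords = rng.uniform(-0.5, 0.5, size=(200, 3))      # normalized coordinates
points = s_true * coords @ R_true.T + t_true        # observed camera-frame points

s_est, t_est = solve_scale_translation(points, coords, R_true)
recovered = canonicalize(points, R_true, t_est, s_est)
```

With noise-free inputs, `s_est` and `t_est` match the synthetic pose and `recovered` matches `coords` up to floating-point error. Because both steps are plain differentiable array operations, a pipeline built this way needs no RANSAC-style sampling, which is the property the abstract emphasizes.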

updated: Thu Oct 21 2021 09:49:46 GMT+0000 (UTC)

published: Thu Apr 08 2021 00:14:58 GMT+0000 (UTC)

arXiv
