Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting

Yangzheng Wu; Mohsen Zand; Ali Etemad; Michael Greenspan

We propose a novel keypoint voting scheme based on intersecting spheres that is more accurate than existing schemes and allows for fewer, more dispersed keypoints. The scheme is based on the distance between points, which, as a 1D quantity, can be regressed more accurately than the 2D and 3D vector and offset quantities regressed in previous work, yielding more accurate keypoint localization. The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data, which is particularly effective at handling occlusions. A CNN is trained to estimate the distance between the 3D point corresponding to the depth mode of each RGB pixel and each of 3 dispersed keypoints defined in the object frame. At inference, a sphere is generated centered at each 3D point, with radius equal to this estimated distance. The surfaces of these spheres vote to increment a 3D accumulator space, the peaks of which indicate keypoint locations. The proposed radial voting scheme is more accurate than previous vector or offset schemes, and is robust to dispersed keypoints. Experiments demonstrate RCVPose to be highly accurate and competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and YCB-Video (97.2%) datasets, and notably scoring 71.1% on the challenging Occlusion LINEMOD dataset, +4.9% higher than previous methods. On average, it outperforms all other published results from the BOP benchmark for these 3 datasets. Our code is available at http://www.github.com/aaronwool/rcvpose.
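
As a rough illustration of the radial voting step described in the abstract, the sketch below accumulates sphere-surface votes on a voxel grid and returns the peak. It is a minimal sketch and not the RCVPose implementation: the function name `radial_vote`, the grid parameters, the half-voxel tolerance used to decide whether a sphere surface crosses a voxel, the brute-force loop over all voxels, and the toy data are all assumptions made for illustration. The radii here stand in for the distances that the CNN would estimate.

```python
import math
from itertools import product


def radial_vote(points, radii, grid_min, voxel_size, grid_shape):
    """Return (center, votes) of the voxel crossed by the most sphere surfaces.

    points: list of (x, y, z) scene points (sphere centers)
    radii:  estimated distance from each scene point to the keypoint
    """
    # Assumption: a sphere surface "crosses" a voxel if it passes within
    # half a voxel width of the voxel center (a coarse approximation).
    tol = voxel_size / 2
    best_votes, best_center = -1, None
    for idx in product(*(range(n) for n in grid_shape)):
        center = tuple(grid_min[a] + (idx[a] + 0.5) * voxel_size for a in range(3))
        # One vote per sphere whose surface passes through this voxel.
        votes = sum(
            1 for p, r in zip(points, radii) if abs(math.dist(center, p) - r) <= tol
        )
        if votes > best_votes:
            best_votes, best_center = votes, center
    return best_center, best_votes


# Toy data (hypothetical): six scene points placed 0.3 units from a known keypoint.
keypoint = (0.125, -0.075, 0.225)
offsets = [(0.3, 0, 0), (-0.3, 0, 0), (0, 0.3, 0), (0, -0.3, 0), (0, 0, 0.3), (0, 0, -0.3)]
points = [tuple(k + o for k, o in zip(keypoint, off)) for off in offsets]

# Stand-in for the CNN output: true distances plus a small fixed error.
errors = [0.004, -0.003, 0.002, -0.004, 0.003, -0.002]
radii = [math.dist(p, keypoint) + e for p, e in zip(points, errors)]

estimate, votes = radial_vote(
    points, radii, grid_min=(-0.5, -0.5, -0.5), voxel_size=0.05, grid_shape=(20, 20, 20)
)
print(estimate, votes)
```

In this toy setup the voxel containing the keypoint is the only one crossed by all six spheres, so it is the accumulator peak. A practical version would likely visit only the voxels near each sphere surface instead of scanning the whole grid, and would repeat the vote once per keypoint.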

updated: Tue Jul 12 2022 23:50:22 GMT+0000 (UTC)

published: Tue Apr 06 2021 14:06:08 GMT+0000 (UTC)

arXiv
