Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks

Jiehong Lin; Zewei Wei; Changxing Ding; Kui Jia


Precisely annotating object instances and their semantics in 3D space is difficult, so synthetic data are used extensively for such tasks, e.g., category-level 6D object pose and size estimation. However, the ease of annotation in synthetic domains comes at the cost of a synthetic-to-real (Sim2Real) domain gap. In this work, we address this issue in the Sim2Real task setting, i.e., unsupervised domain adaptation for category-level 6D object pose and size estimation. We propose a method built upon a novel Deep Prior Deformation Network, abbreviated as DPDN. DPDN learns to deform features of categorical shape priors to match those of object observations, and is thus able to establish deep correspondence in the feature space for direct regression of object poses and sizes. To reduce the Sim2Real domain gap, we formulate a novel self-supervised objective upon DPDN via consistency learning. More specifically, we apply two rigid transformations to each object observation in parallel and feed the two transformed observations into DPDN to yield dual sets of predictions. On top of this parallel learning, an inter-consistency term keeps the dual predictions consistent with each other, improving the sensitivity of DPDN to pose changes, while individual intra-consistency terms enforce self-adaptation within each learning branch. We train DPDN on the training sets of both the synthetic CAMERA25 and the real-world REAL275 datasets; our results outperform existing methods on the REAL275 test set under both the unsupervised and supervised settings. Ablation studies also verify the efficacy of our designs. Our code is publicly available at https://github.com/JiehongLin/Self-DPDN.
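The inter-consistency idea in the abstract can be illustrated with a small numerical sketch. This is not the DPDN implementation or its actual loss. It assumes the observation is a point set, each augmentation is a known rigid transformation x -> R x + t, and the network's output is reduced to a rotation and a translation (size is ignored). The function names (`rotation_z`, `apply_rigid`, `undo_augmentation`, `inter_consistency`) and the squared-error penalty are hypothetical choices made for illustration. Under these assumptions, a pose predicted for an augmented observation can be mapped back to the original frame, and the two mapped-back poses should agree.

```python
import numpy as np


def rotation_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the z-axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def apply_rigid(points, R, t):
    """Apply x -> R @ x + t to an (N, 3) array of points."""
    return points @ R.T + t


def undo_augmentation(R_pred, t_pred, R_aug, t_aug):
    """Map a pose predicted for an augmented observation back to the
    frame of the original observation."""
    return R_aug.T @ R_pred, R_aug.T @ (t_pred - t_aug)


def inter_consistency(pred_1, aug_1, pred_2, aug_2):
    """Hypothetical penalty (not the paper's loss): after undoing each
    augmentation, the two predicted poses should coincide."""
    R_a, t_a = undo_augmentation(*pred_1, *aug_1)
    R_b, t_b = undo_augmentation(*pred_2, *aug_2)
    return float(np.sum((R_a - R_b) ** 2) + np.sum((t_a - t_b) ** 2))


# Toy setup: a ground-truth pose and two rigid augmentations (all made up).
R_gt, t_gt = rotation_z(0.3), np.array([0.1, -0.2, 0.5])
aug_1 = (rotation_z(0.5), np.array([0.02, 0.0, -0.01]))
aug_2 = (rotation_z(-0.4), np.array([-0.03, 0.01, 0.0]))

# The two augmented observations that would be fed to the network.
observation = apply_rigid(np.random.default_rng(0).normal(size=(64, 3)), R_gt, t_gt)
view_1 = apply_rigid(observation, *aug_1)
view_2 = apply_rigid(observation, *aug_2)

# An idealized predictor that tracks each augmentation exactly.
oracle_1 = (aug_1[0] @ R_gt, aug_1[0] @ t_gt + aug_1[1])
oracle_2 = (aug_2[0] @ R_gt, aug_2[0] @ t_gt + aug_2[1])
loss_oracle = inter_consistency(oracle_1, aug_1, oracle_2, aug_2)

# A predictor that ignores the augmentations and always outputs the same pose.
insensitive = (R_gt, t_gt)
loss_insensitive = inter_consistency(insensitive, aug_1, insensitive, aug_2)
```

In this toy setup the penalty vanishes (up to floating-point error) for the predictor that follows each transformation and is strictly positive for the one that ignores them, which is the sense in which such a term rewards sensitivity to pose changes.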

updated: Tue Jul 12 2022 10:24:52 GMT+0000 (UTC)

published: Tue Jul 12 2022 10:24:52 GMT+0000 (UTC)

arXiv
