GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts

Haoran Geng; Helin Xu; Chengyang Zhao; Chao Xu; Li Yi; Siyuan Huang; He Wang

For years, researchers have been devoted to generalizable object perception and manipulation, where cross-category generalizability is highly desired yet underexplored. In this work, we propose to learn such cross-category skills via Generalizable and Actionable Parts (GAParts). By identifying and defining 9 GAPart classes (lids, handles, etc.) in 27 object categories, we construct a large-scale part-centric interactive dataset, GAPartNet, where we provide rich part-level annotations (semantics, poses) for 8,489 part instances on 1,166 objects. Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation. Given the significant domain gaps between seen and unseen object categories, we propose a robust 3D segmentation method from the perspective of domain generalization by integrating adversarial learning techniques. Our method outperforms all existing methods by a large margin on both seen and unseen categories. Furthermore, using the part segmentation and pose estimation results, we leverage the GAPart pose definition to design part-based manipulation heuristics that generalize well to unseen object categories in both the simulator and the real world. Our dataset, code, and demos are available on our project page.

updated: Sun Mar 26 2023 23:59:07 GMT+0000 (UTC)

published: Thu Nov 10 2022 00:30:22 GMT+0000 (UTC)

arXiv

