TransNet: Category-Level Transparent Object Pose Estimation

Huijie Zhang; Anthony Opipari; Xiaotong Chen; Jiyue Zhu; Zeren Yu; Odest Chadwicke Jenkins

Transparent objects present multiple distinct challenges to visual perception systems. First, their lack of distinguishing visual features makes transparent objects harder to detect and localize than opaque objects. Even humans find certain transparent surfaces with little specular reflection or refraction, e.g., glass doors, difficult to perceive. Second, the depth sensors commonly used for opaque object perception cannot obtain accurate depth measurements on transparent objects because of these objects' unique reflective properties. Stemming from these challenges, we observe that transparent object instances within the same category (e.g., cups) look more similar to each other than to ordinary opaque objects of that same category. Given this observation, the present paper explores the possibility of category-level, rather than instance-level, transparent object pose estimation. We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation. TransNet is evaluated in terms of pose estimation accuracy on a recent, large-scale transparent object dataset and compared to a state-of-the-art category-level pose estimation approach. Results from this comparison demonstrate that TransNet achieves improved pose estimation accuracy on transparent objects, and key findings from the included ablation studies suggest future directions for performance improvements.

updated: Mon Aug 22 2022 01:34:31 GMT+0000 (UTC)

published: Mon Aug 22 2022 01:34:31 GMT+0000 (UTC)

arXiv
