SATR: Zero-Shot Semantic Segmentation of 3D Shapes

Ahmed Abdelreheem; Ivan Skorokhodov; Maks Ovsjanikov; Peter Wonka

We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models. Surprisingly, we find that modern zero-shot 2D object detectors are better suited for this task than contemporary text/image similarity predictors or even zero-shot 2D segmentation networks. Our key finding is that it is possible to extract accurate 3D segmentation maps from multi-view bounding box predictions by using the topological properties of the underlying surface. For this, we develop the Segmentation Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks. SATR achieves state-of-the-art performance and outperforms a baseline algorithm by 1.3% and 4% average mIoU on the FAUST coarse and fine-grained benchmarks, respectively, and by 5.2% average mIoU on the ShapeNetPart benchmark. Our source code and data will be publicly released. Project webpage: https://samir55.github.io/SATR/.
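To make the idea of turning multi-view bounding box predictions into per-face labels concrete, here is a minimal sketch of a naive voting scheme. It is not the SATR algorithm: it omits the topological reweighting entirely, and it assumes a renderer has already produced, for each view, a pixel-to-face index map. The function name `aggregate_box_votes`, the box tuple layout, and the input format are all hypothetical choices made for illustration.

```python
import numpy as np

def aggregate_box_votes(face_ids, boxes, num_faces, num_labels):
    """Naive multi-view voting (illustrative only, not the SATR method).

    face_ids: list of (H, W) integer arrays, one per rendered view; each pixel
              holds the index of the mesh face visible there, or -1 for background.
    boxes:    list (one entry per view) of lists of tuples
              (x0, y0, x1, y1, label, score) from a 2D detector (assumed format).
    Returns an array of shape (num_faces,) with the winning label per face,
    or -1 for faces that received no votes.
    """
    scores = np.zeros((num_faces, num_labels), dtype=np.float64)
    for view_faces, view_boxes in zip(face_ids, boxes):
        for x0, y0, x1, y1, label, score in view_boxes:
            # Faces visible inside this box each receive one vote for `label`.
            patch = view_faces[y0:y1, x0:x1]
            visible = np.unique(patch[patch >= 0])
            scores[visible, label] += score
    labels = scores.argmax(axis=1)
    # Faces never covered by any box stay unlabeled.
    labels[scores.sum(axis=1) == 0] = -1
    return labels
```

Because every visible face inside a box gets the same vote, this baseline over-assigns labels near box borders, which is the kind of error that reweighting votes by surface properties is meant to reduce.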

updated: Mon Aug 21 2023 00:37:57 GMT+0000 (UTC)

published: Tue Apr 11 2023 00:43:16 GMT+0000 (UTC)

arXiv
