Unsupervised Segmentation in Real-World Images via Spelke Object Inference

Honglin Chen; Rahul Venkatesh; Yoni Friedman; Jiajun Wu; Joshua B. Tenenbaum; Daniel L. K. Yamins; Daniel M. Bear

Self-supervised, category-agnostic segmentation of real-world images is a challenging open problem in computer vision. Here, we show how to learn static grouping priors from motion self-supervision by building on the cognitive science concept of a Spelke Object: a set of physical stuff that moves together. We introduce the Excitatory-Inhibitory Segment Extraction Network (EISEN), which learns to extract pairwise affinity graphs for static scenes from motion-based training signals. EISEN then produces segments from affinities using a novel graph propagation and competition network. During training, objects that undergo correlated motion (such as robot arms and the objects they move) are decoupled by a bootstrapping process: EISEN explains away the motion of objects it has already learned to segment. We show that EISEN achieves a substantial improvement in the state of the art for self-supervised image segmentation on challenging synthetic and real-world robotics datasets.
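The two ideas in the abstract, pairwise affinities supervised by motion and segments read out from an affinity graph, can be illustrated with a toy sketch. This is not EISEN itself: there is no learned network here, and the grouping step is plain min-label propagation over a thresholded graph, swapped in for the paper's graph propagation and competition network. The function names, the thresholds, and the rule that static-static pairs carry no training signal are all assumptions made for illustration.

```python
import numpy as np

def motion_affinity_targets(flow, moving_thresh=0.5, same_thresh=0.5):
    """Toy pairwise targets from per-pixel motion vectors (hypothetical helper).

    flow: (N, 2) array of motion vectors, one per pixel.
    Returns (targets, valid), both (N, N):
      targets[i, j] = 1 if pixels i and j both move, with similar motion
      valid[i, j]   = False when both pixels are static (assumed: no signal)
    """
    moving = np.linalg.norm(flow, axis=1) > moving_thresh
    diff = np.linalg.norm(flow[:, None, :] - flow[None, :, :], axis=2)
    both_moving = moving[:, None] & moving[None, :]
    one_moving = moving[:, None] ^ moving[None, :]
    targets = (both_moving & (diff < same_thresh)).astype(float)
    valid = both_moving | one_moving
    return targets, valid

def segments_from_affinity(aff, thresh=0.5, max_iters=100):
    """Group nodes by min-label propagation on a thresholded affinity graph.

    A simple stand-in for a learned propagation network: each node repeatedly
    adopts the smallest label among itself and its neighbors until nothing
    changes, so every connected component ends up with a single label.
    """
    n = aff.shape[0]
    adj = (aff > thresh) | np.eye(n, dtype=bool)
    adj = adj | adj.T  # treat affinities as symmetric
    labels = np.arange(n)
    for _ in range(max_iters):
        # Non-neighbors are filled with n, which is larger than any label.
        new = np.where(adj, labels[None, :], n).min(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Five "pixels": two move right together, two move up together, one is static.
flow = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
targets, valid = motion_affinity_targets(flow)
print(segments_from_affinity(targets))  # [0 0 2 2 4]
```

In this toy setting the targets are used directly as affinities. In the setting the abstract describes, a network would instead predict affinities from a single static image, with motion supplying the supervision only during training.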

updated: Mon Jul 25 2022 16:24:49 GMT+0000 (UTC)

published: Tue May 17 2022 17:39:24 GMT+0000 (UTC)

arXiv
