CrOC: Cross-View Online Clustering for Dense Visual Representation Learning

Thomas Stegmüller; Tim Lebailly; Behzad Bozorgtabar; Tinne Tuytelaars; Jean-Philippe Thiran

Learning dense visual representations without labels is an arduous task, and more so from scene-centric data. We tackle this challenging problem with a Cross-view consistency objective and an Online Clustering mechanism (CrOC) that discovers and segments the semantics of the views. Because it relies on no hand-crafted priors, the resulting method is more generalizable and does not require a cumbersome pre-processing step. More importantly, the clustering algorithm operates jointly on the features of both views, thereby elegantly bypassing the issue of content not represented in both views and the ambiguous matching of objects from one crop to the other. We demonstrate excellent performance on linear and unsupervised segmentation transfer tasks on various datasets, and similarly for video object segmentation. Our code and pre-trained models are publicly available at https://github.com/stegmuel/CrOC.
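The idea of clustering both views jointly can be illustrated with a small sketch. This is not the authors' implementation: plain k-means stands in for the online clustering mechanism, the features are toy 2-D tuples rather than dense network outputs, and the names `kmeans`, `cross_view_centroids`, and `consistency_loss` are hypothetical. It only shows why joint clustering avoids cross-view matching: every cluster index means the same thing in both views, and a cluster that appears in only one view is simply left out of the consistency term.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation.

    Assumption: a stand-in for the paper's online clustering, not its method.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(x) / len(vectors) for x in zip(*vectors))


def cross_view_centroids(view_a, view_b, k):
    """Cluster the features of both views together, then pool per view.

    Returns {cluster_id: (centroid_in_a, centroid_in_b)} for clusters
    that have members in both views; one-view-only clusters are skipped.
    """
    labels = kmeans(view_a + view_b, k)
    labels_a, labels_b = labels[:len(view_a)], labels[len(view_a):]
    pairs = {}
    for j in range(k):
        in_a = [f for f, lab in zip(view_a, labels_a) if lab == j]
        in_b = [f for f, lab in zip(view_b, labels_b) if lab == j]
        if in_a and in_b:
            pairs[j] = (mean(in_a), mean(in_b))
    return pairs


def consistency_loss(pairs):
    """Mean squared distance between matched per-view centroids (toy objective)."""
    if not pairs:
        return 0.0
    return sum(dist(a, b) ** 2 for a, b in pairs.values()) / len(pairs)
```

In a real pipeline the per-view centroids would presumably feed a learned consistency objective rather than a raw squared distance; the squared distance here is only a placeholder.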

updated: Thu Mar 23 2023 13:24:16 GMT+0000 (UTC)

published: Thu Mar 23 2023 13:24:16 GMT+0000 (UTC)

arXiv
