Neural Congealing: Aligning Images to a Joint Semantic Atlas

Dolev Ofri-Amar; Michal Geyer; Yoni Kasten; Tali Dekel

We present Neural Congealing -- a zero-shot self-supervised framework for detecting and jointly aligning semantically-common content across a given set of images. Our approach harnesses the power of pre-trained DINO-ViT features to learn: (i) a joint semantic atlas -- a 2D grid that captures the mode of DINO-ViT features in the input set, and (ii) dense mappings from the unified atlas to each of the input images. We derive a new robust self-supervised framework that optimizes the atlas representation and mappings per image set, requiring only a few real-world images as input without any additional input information (e.g., segmentation masks). Notably, we design our losses and training paradigm to account only for the shared content under severe variations in appearance, pose, background clutter or other distracting objects. We demonstrate results on a plethora of challenging image sets including sets of mixed domains (e.g., aligning images depicting sculpture and artwork of cats), sets depicting related yet different object categories (e.g., dogs and tigers), or domains for which large-scale training data is scarce (e.g., coffee mugs). We thoroughly evaluate our method and show that our test-time optimization approach performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
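The relationship between the atlas and the per-image mappings can be illustrated with a toy alignment objective. The following is a minimal sketch under assumptions that are not taken from the abstract: each mapping is a 2x3 affine matrix, feature maps are small nested lists standing in for DINO-ViT features, and the loss is a mean cosine distance. All function names are hypothetical, and the atlas is held fixed here, whereas the method described above optimizes the atlas and the mappings jointly.

```python
import math

def bilinear_sample(feat, x, y):
    """Sample feat[row][col] (a list of channels) at continuous (x, y), clamped to the border."""
    h, w = len(feat), len(feat[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    return [
        (1 - ay) * ((1 - ax) * a + ax * b) + ay * ((1 - ax) * c + ax * d)
        for a, b, c, d in zip(feat[y0][x0], feat[y0][x1], feat[y1][x0], feat[y1][x1])
    ]

def affine(theta, u, v):
    """Map atlas coordinates (u, v) to image coordinates with a 2x3 affine matrix (assumed form)."""
    (a, b, tx), (c, d, ty) = theta
    return a * u + b * v + tx, c * u + d * v + ty

def cosine_distance(p, q):
    """1 - cosine similarity, with a small epsilon to avoid division by zero."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm = math.sqrt(sum(pi * pi for pi in p)) * math.sqrt(sum(qi * qi for qi in q))
    return 1.0 - dot / (norm + 1e-8)

def alignment_loss(atlas, image_feats, thetas):
    """Mean cosine distance between each atlas cell and the image feature it maps to."""
    total, count = 0.0, 0
    for feat, theta in zip(image_feats, thetas):
        for v, row in enumerate(atlas):
            for u, atlas_vec in enumerate(row):
                x, y = affine(theta, u, v)
                total += cosine_distance(atlas_vec, bilinear_sample(feat, x, y))
                count += 1
    return total / count
```

In this toy setting, a mapping that sends each atlas cell to the matching location in an image yields a loss near zero, while a misaligned mapping yields a larger value. That difference is the signal an optimizer would follow when updating the mapping parameters.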

updated: Mon Mar 06 2023 18:12:46 GMT+0000 (UTC)

published: Wed Feb 08 2023 09:26:22 GMT+0000 (UTC)

arXiv
