Visual Boundary Knowledge Translation for Foreground Segmentation

Zunlei Feng; Lechao Cheng; Xinchao Wang; Xiang Wang; Yajie Liu; Xiangtong Du; Mingli Song

When confronted with objects of unknown types in an image, humans can effortlessly and precisely tell their visual boundaries. This recognition mechanism and its underlying generalization capability contrast with state-of-the-art image segmentation networks, which rely on large-scale, category-aware annotated training samples. In this paper, we make an attempt towards building models that explicitly account for visual boundary knowledge, in the hope of reducing the training effort needed to segment unseen categories. Specifically, we investigate a new task termed Boundary Knowledge Translation (BKT). Given a set of fully labeled categories, BKT aims to translate the visual boundary knowledge learned from those categories to a set of novel categories, each of which is provided with only a few labeled samples. To this end, we propose a Translation Segmentation Network (Trans-Net), which comprises a segmentation network and two boundary discriminators. The segmentation network, combined with a boundary-aware self-supervised mechanism, is devised to conduct foreground segmentation, while the two discriminators work together in an adversarial manner to ensure accurate segmentation of the novel categories under light supervision. Extensive experiments demonstrate that, with only tens of labeled samples as guidance, Trans-Net achieves results close to those of fully supervised methods.
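
The BKT data setting described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `make_bkt_split`, the `(sample_id, category)` record layout, and the idea of keeping the remaining novel-category samples as an unlabeled pool are all assumptions made for illustration. It only shows how a dataset might be divided into fully labeled source categories and novel categories with a few labeled samples each.

```python
import random
from collections import defaultdict


def make_bkt_split(samples, novel_categories, shots_per_category, seed=0):
    """Split (sample_id, category) pairs into a BKT-style setting.

    Hypothetical helper: source categories keep every label, while each
    novel category keeps labels for only `shots_per_category` samples.
    """
    rng = random.Random(seed)
    novel = set(novel_categories)

    source_labeled = []
    by_novel_category = defaultdict(list)
    for sample_id, category in samples:
        if category in novel:
            by_novel_category[category].append(sample_id)
        else:
            # Source categories are fully labeled.
            source_labeled.append(sample_id)

    novel_labeled, novel_unlabeled = [], []
    for category in sorted(by_novel_category):
        ids = list(by_novel_category[category])
        rng.shuffle(ids)
        # Only a few samples per novel category keep their labels.
        novel_labeled.extend(ids[:shots_per_category])
        # Assumption: the rest are treated as an unlabeled pool.
        novel_unlabeled.extend(ids[shots_per_category:])

    return {
        "source_labeled": source_labeled,
        "novel_labeled": novel_labeled,
        "novel_unlabeled": novel_unlabeled,
    }
```

For example, calling `make_bkt_split(samples, ["bird"], 10)` on a dataset with 50 bird samples would keep labels for 10 of them, in line with the "tens of labeled samples" regime mentioned in the abstract.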

updated: Sun Aug 01 2021 07:10:25 GMT+0000 (UTC)

published: Sun Aug 01 2021 07:10:25 GMT+0000 (UTC)

arXiv
