APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation

Jiacheng Chen; Bin-Bin Gao; Zongqing Lu; Jing-Hao Xue; Chengjie Wang; Qingmin Liao

Few-shot semantic segmentation aims to segment novel-class objects in a given query image using only a few labeled support images. Most advanced solutions exploit a metric learning framework that performs segmentation by matching each query feature to a learned class-specific prototype. However, this framework suffers from biased classification due to incomplete feature comparisons. To address this issue, we present an adaptive prototype representation that introduces both class-specific and class-agnostic prototypes, and thus constructs complete sample pairs for learning semantic alignment with query features. This complementary feature learning scheme effectively enriches feature comparison and helps yield an unbiased segmentation model in the few-shot setting. It is implemented as a two-branch end-to-end network (i.e., a class-specific branch and a class-agnostic branch) that generates prototypes and then combines them with query features to perform comparisons. In addition, the proposed class-agnostic branch is simple yet effective: in practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner. Extensive experiments on PASCAL-5^i and COCO-20^i demonstrate the superiority of our method. Without sacrificing inference efficiency, our model achieves state-of-the-art results in both 1-shot and 5-shot settings for semantic segmentation.
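
The prototype-matching idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes masked average pooling to build a class-specific prototype and cosine similarity for matching (both common choices in few-shot segmentation, but not stated in the abstract), and it takes the class-agnostic prototypes as given inputs rather than generating them adaptively. The function names `masked_average_pooling`, `cosine_similarity_map`, and `segment_query` are hypothetical.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the feature vectors at locations where mask == 1.

    features: (C, H, W) array; mask: (H, W) binary array.
    Assumption: this pooling is a common way to form a prototype,
    not necessarily the one used in the paper.
    """
    weights = mask.astype(features.dtype)
    total = (features * weights[None]).sum(axis=(1, 2))
    return total / max(weights.sum(), 1e-8)


def cosine_similarity_map(features, prototype):
    """Cosine similarity between each spatial location and one prototype."""
    c, h, w = features.shape
    flat = features.reshape(c, -1)                      # (C, H*W)
    flat_norm = np.linalg.norm(flat, axis=0) + 1e-8     # per-location norm
    proto_norm = np.linalg.norm(prototype) + 1e-8
    sims = (prototype @ flat) / (flat_norm * proto_norm)
    return sims.reshape(h, w)


def segment_query(query_features, class_prototype, agnostic_prototypes):
    """Label a location as foreground when the class-specific prototype
    matches it better than every class-agnostic prototype.

    Assumption: agnostic_prototypes is a non-empty list of (C,) vectors
    supplied by the caller; how they are produced is outside this sketch.
    """
    fg = cosine_similarity_map(query_features, class_prototype)
    bg = np.stack(
        [cosine_similarity_map(query_features, p) for p in agnostic_prototypes]
    ).max(axis=0)
    return (fg > bg).astype(np.uint8)
```

Comparing each query location against both kinds of prototype, rather than thresholding similarity to the class-specific prototype alone, is one way to read the "complete sample pairs" described above; the actual network learns this comparison end to end.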

updated: Wed Nov 24 2021 04:38:37 GMT+0000 (UTC)

published: Wed Nov 24 2021 04:38:37 GMT+0000 (UTC)

arXiv
