A New Local Transformation Module for Few-shot Segmentation

Yuwei Yang; Fanman Meng; Hongliang Li; Qingbo Wu; Xiaolong Xu; Shuai Chen

Few-shot segmentation segments object regions of new classes from only a few manual annotations. Its key step is to establish a transformation module between support images (annotated images) and query images (unlabeled images), so that the segmentation cues of the support images can guide the segmentation of the query images. Existing methods build the transformation model from global cues and ignore local cues, which this paper verifies to be very important for the transformation. This paper proposes a new transformation module based on local cues, in which the relationships between local features are used for the transformation. To enhance the generalization performance of the network, the relationship matrix is calculated in a high-dimensional metric embedding space based on cosine distance. In addition, to handle the challenging mapping from low-level local relationships to high-level semantic cues, we propose to apply the generalized inverse of the annotation matrix of the support images to transform the relationship matrix linearly, an operation that is non-parametric and class-agnostic. The result of the matrix transformation can be regarded as an attention map with high-level semantic cues, on which a transformation module can be built simply. The proposed transformation module is a general module that can replace the transformation module in existing few-shot segmentation frameworks. We verify the effectiveness of the proposed method on the Pascal VOC 2012 dataset. The mIoU reaches 57.0% in the 1-shot setting and 60.6% in the 5-shot setting, outperforming the state-of-the-art method by 1.6% and 3.5%, respectively.
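
The two steps named in the abstract, a cosine-based relationship matrix over local features followed by a linear transformation with the generalized inverse of the support annotation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact shapes or the embedding network, so the sketch assumes the local features are already embedded and flattened to one row per spatial position, and it assumes the support mask is flattened into a column vector before taking its Moore-Penrose pseudo-inverse. The function names `cosine_relation` and `attention_from_relation` are hypothetical.

```python
import numpy as np


def cosine_relation(query_feats, support_feats, eps=1e-8):
    """Cosine similarity between every query and support position.

    query_feats:   (N_q, C) array, one embedded local feature per row.
    support_feats: (N_s, C) array, one embedded local feature per row.
    Returns an (N_q, N_s) relationship matrix.
    """
    q = query_feats / (np.linalg.norm(query_feats, axis=1, keepdims=True) + eps)
    s = support_feats / (np.linalg.norm(support_feats, axis=1, keepdims=True) + eps)
    return q @ s.T


def attention_from_relation(relation, support_mask):
    """Linearly transform the relationship matrix with the generalized
    inverse of the support annotation (assumed layout, see lead-in).

    relation:     (N_q, N_s) relationship matrix.
    support_mask: binary mask with N_s entries in total (any shape).
    Returns an (N_q,) attention vector over query positions.
    """
    # Assumption: treat the annotation as an (N_s, 1) column vector.
    m = support_mask.reshape(-1, 1).astype(np.float64)
    # Moore-Penrose pseudo-inverse, shape (1, N_s); equals m.T / (m.T @ m)
    # for a non-zero mask and all zeros for an empty mask.
    m_pinv = np.linalg.pinv(m)
    # No learned parameters and no class labels are involved in this step.
    return (relation @ m_pinv.T).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    support = rng.normal(size=(16, 8))   # 4x4 support feature map, C = 8
    query = rng.normal(size=(16, 8))     # 4x4 query feature map, C = 8
    mask = np.zeros((4, 4))
    mask[1:3, 1:3] = 1                   # toy foreground annotation
    attention = attention_from_relation(cosine_relation(query, support), mask)
    print(attention.reshape(4, 4))
```

Under these assumptions, a binary mask makes the result the mean cosine similarity between each query position and the annotated foreground positions of the support image, which is one way to read the "attention map with high-level semantic cues" described above.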

updated: Tue Nov 12 2019 01:02:48 GMT+0000 (UTC)

published: Mon Oct 14 2019 01:52:21 GMT+0000 (UTC)

arXiv
