AINet: Association Implantation for Superpixel Segmentation

Yaxiong Wang; Yunchao Wei; Xueming Qian; Li Zhu; Yi Yang

Recently, several approaches have been proposed that harness deep convolutional networks to facilitate superpixel segmentation. The common practice is to first divide the image evenly into a predefined number of grid cells and then learn to associate each pixel with its surrounding cells. However, simply applying a series of convolution operations with limited receptive fields can only implicitly perceive the relations between a pixel and its surrounding grid cells. Consequently, existing methods often fail to provide effective context when inferring the association map. To remedy this issue, we propose a novel Association Implantation (AI) module that enables the network to explicitly capture the relations between a pixel and its surrounding grid cells. The proposed AI module directly implants the features of the grid cells around the corresponding central pixel and conducts convolution on the padded window to adaptively transfer knowledge between them. With such an implantation operation, the network can explicitly harvest pixel-grid level context, which is more in line with the target of superpixel segmentation than pixel-wise relations. Furthermore, to pursue better boundary precision, we design a boundary-perceiving loss that helps the network discriminate the pixels around boundaries at the hidden feature level, which helps the subsequent inference modules accurately identify more boundary pixels. Extensive experiments on the BSDS500 and NYUv2 datasets show that our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
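The implantation step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names `implanted_window` and `convolve_window`, the nested-list layout (`[row][column][channel]`), the zero padding at image borders, and the choice to put the pixel's own feature in the centre slot are all assumptions made for illustration. The abstract only states that grid-cell features are implanted around the central pixel and that a convolution runs over the padded window.

```python
def implanted_window(pixel_feat, grid_feat, y, x, cell_size):
    """Build the 3x3 window for pixel (y, x).

    pixel_feat[y][x] is the pixel's feature vector, grid_feat[gy][gx] is the
    feature vector of a grid cell, and each cell covers cell_size x cell_size
    pixels (illustrative layout, assumed here).
    """
    h, w = len(grid_feat), len(grid_feat[0])
    channels = len(pixel_feat[y][x])
    gy, gx = y // cell_size, x // cell_size  # grid cell containing the pixel
    window = []
    for dy in (-1, 0, 1):
        row = []
        for dx in (-1, 0, 1):
            ny, nx = gy + dy, gx + dx
            if dy == 0 and dx == 0:
                # Centre slot: the pixel's own feature.
                row.append(list(pixel_feat[y][x]))
            elif 0 <= ny < h and 0 <= nx < w:
                # Surrounding slot: feature of the neighbouring grid cell.
                row.append(list(grid_feat[ny][nx]))
            else:
                # Neighbour lies outside the image: zero padding (assumption).
                row.append([0.0] * channels)
        window.append(row)
    return window


def convolve_window(window, kernel):
    """Apply one 3x3 kernel (kernel[i][j][c]) to the window; one output value."""
    return sum(
        kernel[i][j][c] * window[i][j][c]
        for i in range(3)
        for j in range(3)
        for c in range(len(window[i][j]))
    )
```

Because the window mixes one pixel feature with eight grid-cell features, a single 3x3 kernel sees the pixel-to-grid relation directly instead of inferring it from stacked pixel-to-pixel convolutions. A real network would run this for every pixel in parallel and learn the kernel weights.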

updated: Mon Aug 02 2021 08:26:00 GMT+0000 (UTC)

published: Tue Jan 26 2021 10:40:13 GMT+0000 (UTC)

arXiv
