Implicit Feature Refinement for Instance Segmentation

Lufan Ma; Tiancai Wang; Bin Dong; Jiangpeng Yan; Xiu Li; Xiangyu Zhang


We propose a novel implicit feature refinement module for high-quality instance segmentation. Existing image/video instance segmentation methods rely on explicitly stacked convolutions to refine instance features before the final prediction. In this paper, we first give an empirical comparison of different refinement strategies, which reveals that the widely used four consecutive convolutions are not necessary. As an alternative, weight-sharing convolution blocks provide competitive performance. When such a block is iterated infinitely many times, the block output eventually converges to an equilibrium state. Based on this observation, implicit feature refinement (IFR) is developed by constructing an implicit function. The equilibrium state of the instance features can be obtained by fixed-point iteration via a simulated infinite-depth network. Our IFR enjoys several advantages: 1) it simulates an infinite-depth refinement network while requiring only the parameters of a single residual block; 2) it produces high-level equilibrium instance features with a global receptive field; 3) it serves as a general plug-and-play module that is easily extended to most object recognition frameworks. Experiments on the COCO and YouTube-VIS benchmarks show that our IFR achieves improved performance on state-of-the-art image/video instance segmentation frameworks while reducing the parameter burden (e.g., a 1% AP improvement on Mask R-CNN with only 30.0% of the parameters in the mask head). Code is made available at https://github.com/lufanma/IFR.git
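The fixed-point idea in the abstract can be illustrated with a small toy sketch. It does not reproduce the paper's residual block, solver, or training procedure: the update rule `tanh(W z + x)`, the names `block` and `fixed_point`, and the hand-picked weights are all assumptions chosen so that repeated application of one weight-shared block provably converges. The point is only that iterating a single block with fixed parameters reaches an equilibrium `z*` satisfying `z* = block(z*, x)`.

```python
import math

def block(z, x, w):
    """One weight-shared refinement step (toy stand-in): tanh(W z + x)."""
    return [
        math.tanh(sum(w_ij * z_j for w_ij, z_j in zip(row, z)) + x_i)
        for row, x_i in zip(w, x)
    ]

def fixed_point(x, w, tol=1e-8, max_iter=200):
    """Apply the same block repeatedly until the output stops changing."""
    z = [0.0] * len(x)
    steps = 0
    for steps in range(1, max_iter + 1):
        z_next = block(z, x, w)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            break
    return z, steps

# Hypothetical weights: every row's absolute sum is below 1, so the map is a
# contraction and the iteration is guaranteed to converge (an assumption of
# this sketch, not a claim about the paper's block).
W = [[0.2, -0.1, 0.0],
     [0.1, 0.3, -0.2],
     [0.0, 0.1, 0.2]]
x = [0.5, -0.3, 0.8]  # stands in for the input instance features

z_star, steps = fixed_point(x, W)
print("equilibrium:", z_star)
print("iterations:", steps)
```

Only the parameters of the one block (`W` here) are stored, regardless of how many iterations are run, which is the sense in which a single block stands in for a very deep stack.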

updated: Thu Dec 09 2021 05:36:04 GMT+0000 (UTC)

published: Thu Dec 09 2021 05:36:04 GMT+0000 (UTC)

arXiv
