SIN: Superpixel Interpolation Network

Qing Yuan; Songfeng Lu; Yan Huang; Wuxin Sha

Superpixels have been widely used in computer vision tasks due to their representational and computational efficiency. Meanwhile, deep learning and end-to-end frameworks have made great progress in various fields, including computer vision. However, existing superpixel algorithms cannot be integrated into subsequent tasks in an end-to-end way. Traditional algorithms and deep learning-based algorithms are the two main streams in superpixel segmentation. The former are non-differentiable, and the latter need a non-differentiable post-processing step to enforce connectivity, which constrains the integration of superpixels with downstream tasks. In this paper, we propose SIN, a deep learning-based superpixel segmentation algorithm that can be integrated with downstream tasks in an end-to-end way. Because some downstream tasks, such as visual tracking, require real-time speed, the speed of generating superpixels is also important. To remove the post-processing step, our algorithm enforces spatial connectivity from the start. Superpixels are initialized by sampled pixels, and the remaining pixels are assigned to superpixels through multiple updating steps. Each step consists of a horizontal and a vertical interpolation, which is the key to enforcing spatial connectivity. Multi-layer outputs of a fully convolutional network are used to predict association scores for the interpolations. Experimental results show that our approach runs at about 80 fps and performs favorably against state-of-the-art methods. Furthermore, we design a simple but effective loss function that substantially reduces training time. Improvements on superpixel-based tasks demonstrate the effectiveness of our algorithm. We hope SIN will be integrated into downstream tasks in an end-to-end way and benefit the superpixel-based community. Code is available at: https://github.com/yuanqqq/SIN.
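
To make the interpolation idea concrete, here is a toy sketch in plain Python. It is not the authors' implementation, and several details are assumptions: each step is assumed to insert one new pixel between every pair of adjacent pixels, each new pixel copies the label of one of its two neighbours, and the hypothetical callable `pick_first` stands in for the association scores that the paper predicts with a fully convolutional network. The helper `count_components` is likewise only illustrative; it checks that each label forms a single 4-connected region.

```python
def interpolate_horizontal(labels, pick_first):
    """Insert one new pixel between each horizontally adjacent pair.

    Every new pixel copies the label of its left or right neighbour,
    so it always touches a pixel that carries the same label.
    """
    out = []
    for row in labels:
        new_row = [row[0]]
        for c in range(1, len(row)):
            left, right = row[c - 1], row[c]
            # pick_first() is a hypothetical stand-in for the decision
            # derived from the network's predicted association score.
            new_row.append(left if pick_first() else right)
            new_row.append(right)
        out.append(new_row)
    return out


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def interpolate_vertical(labels, pick_first):
    # Vertical interpolation is horizontal interpolation on the transpose.
    return transpose(interpolate_horizontal(transpose(labels), pick_first))


def superpixel_interpolation(height, width, steps, pick_first):
    """Grow a label map from a height x width grid of sampled pixels."""
    # Initial superpixels: one unique label per sampled pixel.
    labels = [[r * width + c for c in range(width)] for r in range(height)]
    for _ in range(steps):
        labels = interpolate_horizontal(labels, pick_first)
        labels = interpolate_vertical(labels, pick_first)
    return labels


def count_components(labels):
    """Count 4-connected regions of equal label with a flood fill."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    components = 0
    for r in range(h):
        for c in range(w):
            if seen[r][c]:
                continue
            components += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and labels[ny][nx] == labels[y][x]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return components
```

Calling `superpixel_interpolation(3, 4, 3, pick_first)` with any boolean-returning `pick_first` grows the 3 x 4 grid of sampled pixels to 17 x 25 in this sketch, since each step maps a side of length n to 2n - 1. Because new pixels only ever copy an adjacent label, `count_components` returns the number of initial superpixels, which is why no connectivity post-processing is needed in this toy setting.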

updated: Sun Oct 17 2021 02:21:11 GMT+0000 (UTC)

published: Sun Oct 17 2021 02:21:11 GMT+0000 (UTC)

arXiv
