Generating Superpixels for High-resolution Images with Decoupled Patch Calibration

Yaxiong Wang; Yunchao Wei; Xueming Qian; Li Zhu; Yi Yang

Superpixel segmentation has recently seen important progress, benefiting from advances in differentiable deep learning. However, superpixel segmentation at very high resolutions remains challenging because of its memory and computation cost, and current advanced superpixel networks fail to process such inputs. In this paper, we devise Patch Calibration Networks (PCNet), aiming to implement high-resolution superpixel segmentation efficiently and accurately. PCNet follows the principle of producing high-resolution output from low-resolution input to save GPU memory and reduce computation cost. To recover the fine details destroyed by the down-sampling operation, we propose a novel Decoupled Patch Calibration (DPC) branch that collaboratively augments the main superpixel generation branch. In particular, DPC takes a local patch from the high-resolution image and dynamically generates a binary mask that forces the network to focus on region boundaries. Because the DPC and main branches share parameters, the fine-detailed knowledge learned from high-resolution patches is transferred to help calibrate the destroyed information. To the best of our knowledge, this is the first attempt to consider deep-learning-based superpixel generation for high-resolution cases. To facilitate this research, we build evaluation benchmarks from two public datasets and one newly constructed dataset, covering a wide range of diversity from fine-grained human parts to cityscapes. Extensive experiments demonstrate that PCNet not only performs favorably against the state of the art in quantitative results but also raises the resolution upper bound from 3K to 5K on 1080Ti GPUs.
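
To make the idea of a boundary-focused binary mask on a local patch more concrete, here is a minimal sketch in plain Python. It is not the authors' DPC branch: the abstract does not say how the mask is produced, so this example assumes, purely for illustration, that the mask is derived from a label map by marking pixels whose 4-neighbours carry a different label. The function names `crop_patch` and `boundary_mask` and the toy label map are hypothetical.

```python
def crop_patch(label_map, top, left, size):
    """Return a size x size patch cut from a 2-D list of region labels."""
    return [row[left:left + size] for row in label_map[top:top + size]]


def boundary_mask(labels):
    """Return a binary mask: 1 where any 4-neighbour has a different label.

    Assumption for illustration only; the paper's mask generation may differ.
    """
    h, w = len(labels), len(labels[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = 1
                    break
    return mask


# Toy "high-resolution" label map with two regions (labels 0 and 1).
label_map = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Take a local 4x4 patch and mark the pixels that sit on the region boundary.
patch = crop_patch(label_map, top=0, left=1, size=4)
mask = boundary_mask(patch)
```

In a training setup, a mask like this could be used to weight a per-pixel loss so that boundary pixels in the high-resolution patch contribute more; whether PCNet does exactly this is not stated in the abstract.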

updated: Mon Aug 23 2021 14:28:41 GMT+0000 (UTC)

published: Thu Aug 19 2021 10:33:05 GMT+0000 (UTC)

arXiv
