See More Than Once -- Kernel-Sharing Atrous Convolution for Semantic Segmentation

Ye Huang; Qingqing Wang; Wenjing Jia; Xiangjian He

State-of-the-art semantic segmentation solutions usually leverage different receptive fields via multiple parallel branches to handle objects of different sizes. However, employing separate kernels for individual branches degrades the generalization and representation abilities of the network, and the number of parameters increases linearly with the number of branches. To tackle this problem, we propose a novel network structure named Kernel-Sharing Atrous Convolution (KSAC), in which branches with different receptive fields share the same kernel, i.e., a single kernel sees the input feature maps more than once with different receptive fields, to facilitate communication among branches and perform feature augmentation inside the network. Experiments conducted on the benchmark PASCAL VOC 2012 dataset show that the proposed sharing strategy can not only boost a network's generalization and representation abilities but also reduce the model complexity significantly. Specifically, on the validation set, when compared with DeepLabV3+ equipped with a MobileNetv2 backbone, the number of parameters is reduced by 33% while the mIOU improves by 0.6%. When Xception is used as the backbone, the mIOU rises from 83.34% to 85.96% with about 10M parameters saved. In addition, unlike the widely used ASPP structure, our proposed KSAC is able to further improve the mIOU by exploiting wider context with larger atrous rates. Finally, our KSAC achieves mIOUs of 88.1% and 45.47% on the PASCAL VOC 2012 test set and the ADE20K dataset, respectively. Our full code will be released on GitHub.
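The kernel-sharing idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it works on a 1-D signal instead of 2-D feature maps, and the kernel weights, the atrous rates (1, 2, 3), the channel widths (256) and the choice to sum the branch outputs are all assumptions made for illustration. It shows one kernel being applied at several rates, and why sharing keeps the weight count of the parallel branches constant instead of growing with the number of branches.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution, zero padded, same-length output."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps sit `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def shared_kernel_branches(signal, kernel, rates):
    """Apply ONE kernel at several atrous rates and fuse the branches.

    Fusing by summation is an assumption of this sketch.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) for vals in zip(*branches)]


def conv_param_count(kernel_size, in_ch, out_ch, num_branches, shared):
    """Weights in the parallel 2-D atrous branches (biases ignored)."""
    per_kernel = kernel_size * kernel_size * in_ch * out_ch
    return per_kernel if shared else per_kernel * num_branches


signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # unit impulse
kernel = [1.0, 2.0, 3.0]   # hypothetical weights, shared by every branch
rates = (1, 2, 3)          # illustrative rates, not taken from the paper

fused = shared_kernel_branches(signal, kernel, rates)
separate = conv_param_count(3, 256, 256, len(rates), shared=False)
shared = conv_param_count(3, 256, 256, len(rates), shared=True)
print(fused)
print(separate, shared)
```

With separate kernels the branch weights scale with `len(rates)`; with a shared kernel they do not, and the same weights receive a training signal from every receptive field. The counts here cover only the parallel branches, so they are not comparable to the whole-model parameter savings reported in the abstract.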

updated: Sat Nov 16 2019 06:43:56 GMT+0000 (UTC)

published: Mon Aug 26 2019 03:01:38 GMT+0000 (UTC)

arXiv
