OVeNet: Offset Vector Network for Semantic Segmentation

Stamatis Alexandropoulos; Christos Sakaridis; Petros Maragos

Semantic segmentation is a fundamental task in visual scene understanding. We focus on the supervised setting, where ground-truth semantic annotations are available. Based on knowledge about the high regularity of real-world scenes, we propose a method for improving class predictions by learning to selectively exploit information from neighboring pixels. In particular, our method is based on the prior that for each pixel, there is a seed pixel in its close neighborhood that shares the same prediction. Motivated by this prior, we design a novel two-head network, named Offset Vector Network (OVeNet), which generates both standard semantic predictions and a dense 2D offset vector field indicating the offset from each pixel to its respective seed pixel. The offset field is used to compute an alternative, seed-based semantic prediction. The two predictions are adaptively fused at each pixel using a learnt dense confidence map for the predicted offset vector field. We supervise offset vectors indirectly by optimizing the seed-based prediction and through a novel loss on the confidence map. Compared to the baseline state-of-the-art architectures HRNet and HRNet+OCR, on which it is built, OVeNet achieves significant performance gains on two prominent benchmarks for semantic segmentation of driving scenes, namely Cityscapes and ACDC. Code is available at https://github.com/stamatisalex/OVeNet
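The seed-based prediction and the per-pixel fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `seed_based_fusion`, the (dy, dx) offset layout in pixel units, the nearest-neighbor sampling, the border clipping, and the convex-combination fusion rule are all assumptions made for illustration.

```python
import numpy as np


def seed_based_fusion(logits, offsets, confidence):
    """Fuse standard and seed-based class scores (illustrative sketch).

    logits:     (C, H, W) per-class scores from the semantic head.
    offsets:    (2, H, W) assumed layout: (dy, dx) from each pixel to its seed.
    confidence: (H, W) values in [0, 1] for the predicted offsets.
    """
    _, height, width = logits.shape
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")

    # Assumption: round to the nearest pixel and clip seeds to the image border.
    seed_y = np.clip(np.rint(ys + offsets[0]).astype(int), 0, height - 1)
    seed_x = np.clip(np.rint(xs + offsets[1]).astype(int), 0, width - 1)

    # Seed-based prediction: each pixel takes the scores of its seed pixel.
    seed_logits = logits[:, seed_y, seed_x]

    # Assumed fusion rule: confidence-weighted blend of the two predictions.
    fused = (1.0 - confidence) * logits + confidence * seed_logits
    return fused, seed_logits


# Toy 1x3 image with 2 classes; the middle pixel is ambiguous.
logits = np.array([[[4.0, 0.4, 0.0]],
                   [[0.0, 0.6, 4.0]]])
offsets = np.zeros((2, 1, 3))
offsets[1, 0, 1] = -1.0          # middle pixel points one step left
confidence = np.zeros((1, 3))
confidence[0, 1] = 1.0           # fully trust the offset at the middle pixel

fused, seed_logits = seed_based_fusion(logits, offsets, confidence)
labels = fused.argmax(axis=0)
```

In this toy input the middle pixel borrows the scores of its left neighbor, so its label follows the seed rather than its own ambiguous scores. Where the confidence is zero, the fused output equals the standard prediction.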

updated: Sat Mar 25 2023 16:52:42 GMT+0000 (UTC)

published: Sat Mar 25 2023 16:52:42 GMT+0000 (UTC)

arXiv
