Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

Jiaming Zhang; Chaoxiang Ma; Kailun Yang; Alina Roitberg; Kunyu Peng; Rainer Stiefelhagen

Autonomous vehicles clearly benefit from the expanded Field of View (FoV) of 360-degree sensors, but modern semantic segmentation approaches rely heavily on annotated training data, which is rarely available for panoramic images. We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting where labelled training data originates from a different distribution of conventional pinhole camera images. To achieve this, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation and collect DensePASS, a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, specifically built to study the Pinhole-to-Panoramic domain shift and accompanied by pinhole camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360-degree images, with the labelled data comprising 19 classes that explicitly fit the categories available in the source (i.e., pinhole) domain. Since data-driven models are especially susceptible to changes in data distribution, we introduce P2PDA, a generic framework for Pinhole-to-Panoramic semantic segmentation that addresses the challenge of domain divergence with different variants of attention-augmented domain adaptation modules, enabling transfer in the output, feature, and feature confidence spaces. P2PDA intertwines uncertainty-aware adaptation using confidence values regulated on-the-fly through attention heads with discrepant predictions. Our framework facilitates context exchange when learning domain correspondences and dramatically improves the adaptation performance of both accuracy-focused and efficiency-focused models. Comprehensive experiments verify that our framework clearly surpasses both unsupervised domain adaptation approaches and specialized panoramic segmentation approaches.

updated: Thu Oct 21 2021 11:22:05 GMT+0000 (UTC)

published: Thu Oct 21 2021 11:22:05 GMT+0000 (UTC)

arXiv
