Augmentation-based Domain Generalization for Semantic Segmentation

Manuel Schwonberg; Fadoua El Bouazati; Nico M. Schmidt; Hanno Gottschalk

Unsupervised domain adaptation (UDA) and domain generalization (DG) are two research areas that aim to tackle the lack of generalization of deep neural networks (DNNs) to unseen domains. While UDA methods have access to unlabeled target images, DG does not involve any target data and only learns generalized features from a source domain. Image-style randomization or augmentation is a popular approach to improve network generalization without access to the target domain. Complex methods are often proposed that disregard the potential of simple image augmentations for out-of-domain generalization. For this reason, we systematically study the in- and out-of-domain generalization capabilities of simple, rule-based image augmentations such as blur, noise, color jitter and many more. Based on a full factorial design of experiments, we provide a systematic statistical evaluation of augmentations and their interactions. Our analysis yields both expected and unexpected outcomes. Expected, because our experiments confirm the common scientific understanding that a combination of multiple different augmentations outperforms single augmentations. Unexpected, because combined augmentations perform competitively with state-of-the-art domain generalization approaches while being significantly simpler and incurring no training overhead. On the challenging synthetic-to-real domain shift between Synthia and Cityscapes we reach 39.5% mIoU, compared to 40.9% mIoU for the best previous work. When additionally employing the recent vision transformer architecture DAFormer, we outperform these benchmarks with 44.2% mIoU.
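To make the idea of a full factorial design concrete, here is a minimal Python sketch; it is not the authors' code. It assumes each augmentation is a two-level factor (off or on), so k augmentations give 2^k runs, and it estimates a main effect as the mean response with the factor on minus the mean with it off. The factor names follow the abstract's examples. The helper names and `toy_score` are hypothetical, and `toy_score` is a synthetic placeholder for "train a network and measure mIoU", not measured data.

```python
from itertools import product

# Factor names taken from the abstract's examples; the paper studies more.
FACTORS = ["blur", "noise", "color_jitter"]


def full_factorial(factors):
    """Return every on/off combination of the factors (2**k runs)."""
    return [
        dict(zip(factors, levels))
        for levels in product([False, True], repeat=len(factors))
    ]


def main_effect(runs, scores, factor):
    """Mean score with the factor on minus mean score with it off."""
    on = [s for r, s in zip(runs, scores) if r[factor]]
    off = [s for r, s in zip(runs, scores) if not r[factor]]
    return sum(on) / len(on) - sum(off) / len(off)


def toy_score(run):
    """Synthetic placeholder response, NOT a measured mIoU.

    In a real study this would train a segmentation network with the
    selected augmentations and evaluate it on the target dataset.
    """
    score = 0.0
    if run["blur"]:
        score += 1.0
    if run["color_jitter"]:
        score += 2.0
    if run["blur"] and run["noise"]:
        score += 0.5  # artificial interaction between two factors
    return score


runs = full_factorial(FACTORS)
scores = [toy_score(r) for r in runs]
effects = {f: main_effect(runs, scores, f) for f in FACTORS}

for name, value in effects.items():
    print(f"{name}: main effect = {value:+.2f}")
```

Because every combination is run, the same table of scores also supports estimating interaction effects, which is what distinguishes a full factorial design from testing one augmentation at a time.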

updated: Mon Apr 24 2023 14:26:53 GMT+0000 (UTC)

published: Mon Apr 24 2023 14:26:53 GMT+0000 (UTC)

arXiv
