Texture Learning Domain Randomization for Domain Generalized Segmentation

Sunghwan Kim; Dae-hwan Kim; Hoseong Kim

ドメイン一般化セグメンテーションのためのテクスチャ学習ドメインのランダム化

ソースドメインでトレーニングされたディープニューラルネットワーク (DNN) ベースのセマンティックセグメンテーションモデルは、多くの場合、目に見えないターゲットドメイン、つまりドメインギャップの問題に一般化するのに苦労します。テクスチャはドメインギャップの原因となることが多く、DNN はテクスチャに偏りやすいため、ドメインシフトに対して脆弱になります。既存のドメイン一般化セマンティックセグメンテーション (DGSS) 手法は、テクスチャよりも形状を優先するようにモデルを誘導することで、ドメインギャップの問題を軽減してきました。一方、形状とテクスチャは、セマンティックセグメンテーションにおける 2 つの顕著な補完的な手がかりです。この論文では、DGSS のパフォーマンスを向上させるにはテクスチャの活用が重要であると主張しています。具体的には、Texture Learning Domain Randomization (TLDR) という造語という新しいフレームワークを提案します。 TLDR には、DGSS でのテクスチャ学習を効果的に強化するための 2 つの新しい損失が含まれています。(1) ImageNet 事前トレーニングモデルのテクスチャ特徴を使用することで、ソースドメインテクスチャへのオーバーフィッティングを防ぐテクスチャ正則化損失と、(2) ランダムスタイルを利用するテクスチャ一般化損失画像を使用して、自己教師付きの方法で多様なテクスチャ表現を学習します。広範な実験結果により、提案された TLDR の優位性が実証されています。たとえば、TLDR は、ResNet-50 を使用して GTA-to-Cityscapes で 46.5 mIoU を達成し、従来の最先端の方法を 1.9 mIoU 改善します。ソースコードは https://github.com/ssssshwan/TLDR で入手できます。

Deep Neural Networks (DNNs)-based semantic segmentation models trained on a source domain often struggle to generalize to unseen target domains, i.e., a domain gap problem. Texture often contributes to the domain gap, making DNNs vulnerable to domain shift because they are prone to be texture-biased. Existing Domain Generalized Semantic Segmentation (DGSS) methods have alleviated the domain gap problem by guiding models to prioritize shape over texture. On the other hand, shape and texture are two prominent and complementary cues in semantic segmentation. This paper argues that leveraging texture is crucial for improving performance in DGSS. Specifically, we propose a novel framework, coined Texture Learning Domain Randomization (TLDR). TLDR includes two novel losses to effectively enhance texture learning in DGSS: (1) a texture regularization loss to prevent overfitting to source domain textures by using texture features from an ImageNet pre-trained model and (2) a texture generalization loss that utilizes random style images to learn diverse texture representations in a self-supervised manner. Extensive experimental results demonstrate the superiority of the proposed TLDR; e.g., TLDR achieves 46.5 mIoU on GTA-to-Cityscapes using ResNet-50, which improves the prior state-of-the-art method by 1.9 mIoU. The source code is available at https://github.com/ssssshwan/TLDR.

updated: Thu Aug 17 2023 10:39:37 GMT+0000 (UTC)

published: Tue Mar 21 2023 02:23:26 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト