Texture Learning Domain Randomization for Domain Generalized Segmentation

Sunghwan Kim; Dae-hwan Kim; Hoseong Kim

ドメインの一般化されたセグメンテーションのためのテクスチャ学習ドメインのランダム化

ソースドメインでトレーニングされたディープニューラルネットワーク (DNN) ベースのセマンティックセグメンテーションモデルは、目に見えないターゲットドメインに一般化するのに苦労することがよくあります。つまり、ドメインギャップの問題です。テクスチャはしばしばドメインギャップの一因となり、DNN はテクスチャバイアスになりやすいため、ドメインシフトに対して脆弱になります。既存の Domain Generalized Semantic Segmentation (DGSS) メソッドは、テクスチャよりも形状を優先するようにモデルを導くことで、ドメインギャップの問題を軽減してきました。一方、形状とテクスチャは、セマンティックセグメンテーションにおける 2 つの顕著な補完的な手がかりです。このホワイトペーパーでは、テクスチャを活用することが DGSS のパフォーマンスを向上させるために重要であると主張しています。具体的には、Texture Learning Domain Randomization (TLDR) という新しいフレームワークを提案します。 TLDR には、DGSS でのテクスチャ学習を効果的に強化するための 2 つの新しい損失が含まれています。(1) ImageNet の事前トレーニング済みモデルのテクスチャ機能を使用してソースドメインテクスチャへの過適合を防止するためのテクスチャ正則化損失と、(2) ランダムスタイルを利用するテクスチャ一般化損失です。画像を使用して、さまざまなテクスチャ表現を自己管理された方法で学習します。広範な実験結果は、提案された TLDR の優位性を示しています。たとえば、TLDR は、ResNet-50 を使用して GTA-to-Cityscapes で 46.5 mIoU を達成します。これは、従来の最先端の方法を 1.9 mIoU 改善します。

Deep Neural Networks (DNNs)-based semantic segmentation models trained on a source domain often struggle to generalize to unseen target domains, i.e., a domain gap problem. Texture often contributes to the domain gap, making DNNs vulnerable to domain shift because they are prone to be texture-biased. Existing Domain Generalized Semantic Segmentation (DGSS) methods have alleviated the domain gap problem by guiding models to prioritize shape over texture. On the other hand, shape and texture are two prominent and complementary cues in semantic segmentation. This paper argues that leveraging texture is crucial for improving performance in DGSS. Specifically, we propose a novel framework, coined Texture Learning Domain Randomization (TLDR). TLDR includes two novel losses to effectively enhance texture learning in DGSS: (1) a texture regularization loss to prevent overfitting to source domain textures by using texture features from an ImageNet pre-trained model and (2) a texture generalization loss that utilizes random style images to learn diverse texture representations in a self-supervised manner. Extensive experimental results demonstrate the superiority of the proposed TLDR; e.g., TLDR achieves 46.5 mIoU on GTA-to-Cityscapes using ResNet-50, which improves the prior state-of-the-art method by 1.9 mIoU.

updated: Tue Mar 21 2023 02:23:26 GMT+0000 (UTC)

published: Tue Mar 21 2023 02:23:26 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト