Towards Online Domain Adaptive Object Detection

Vibashan VS; Poojan Oza; Vishal M. Patel

Existing object detection models assume that both the training and test data are sampled from the same source domain. This assumption does not hold when these detectors are deployed in real-world applications, where they encounter new visual domains. Unsupervised Domain Adaptation (UDA) methods are generally employed to mitigate the adverse effects caused by domain shift. Existing UDA methods operate offline: the model is first adapted to the target domain and then deployed in real-world applications. However, this offline adaptation strategy is not suitable for real-world applications, as the model frequently encounters new domain shifts. Hence, it becomes critical to develop a feasible UDA method that generalizes to the domain shifts encountered at deployment time in a continuous online manner. To this end, we propose a novel unified adaptation framework that adapts and improves generalization on the target domain in online settings. In particular, we introduce MemXformer, a cross-attention transformer-based memory module in which the memory items take advantage of domain shifts and record prototypical patterns of the target distribution. Further, MemXformer produces strong positive and negative pairs to guide a novel contrastive loss, which enhances target-specific representation learning. Experiments on diverse detection benchmarks show that the proposed strategy can produce state-of-the-art performance in both online and offline settings. To the best of our knowledge, this is the first work to address online and offline adaptation settings for object detection. Code is available at https://github.com/Vibashan/online-od
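
The abstract describes two mechanisms only at a high level: a cross-attention read over memory items, and a contrastive loss driven by positive and negative pairs drawn from that memory. The following is a minimal sketch of those two ideas in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`memory_read`, `info_nce`), the scaled dot-product attention, the use of an InfoNCE-style loss, the rule that the most-attended memory item is the positive, and the toy vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def memory_read(query, memory):
    """Cross-attention read (assumed form): the query attends over memory items.

    Returns the attention weights and the attention-weighted sum of the items.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, item) / scale for item in memory])
    read = [sum(w * item[d] for w, item in zip(weights, memory))
            for d in range(len(query))]
    return weights, read


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss on cosine similarities (assumed loss)."""
    a = normalize(anchor)
    logits = [dot(a, normalize(positive)) / temperature]
    logits += [dot(a, normalize(n)) / temperature for n in negatives]
    # The positive sits at index 0, so the loss is its negative log-probability.
    return -math.log(softmax(logits)[0])


# Hypothetical toy data: three memory items and one target-feature query.
memory = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
query = [0.9, 0.1, 0.0, 0.0]

weights, read = memory_read(query, memory)

# Assumed pairing rule: most-attended item is the positive, the rest negatives.
best = max(range(len(memory)), key=lambda i: weights[i])
negatives = [m for i, m in enumerate(memory) if i != best]
loss = info_nce(query, memory[best], negatives)
```

In this sketch the loss is small when the query already lies close to its most-attended memory item and grows when a poorly matching item is treated as the positive. How the paper actually builds the memory, selects pairs, and updates items is not stated in the abstract.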

updated: Mon Apr 11 2022 17:47:22 GMT+0000 (UTC)

published: Mon Apr 11 2022 17:47:22 GMT+0000 (UTC)

arXiv
