Reducing Annotation Effort by Identifying and Labeling Contextually Diverse Classes for Semantic Segmentation Under Domain Shift

Sharat Agarwal; Saket Anand; Chetan Arora

In Active Domain Adaptation (ADA), one uses Active Learning (AL) to select a subset of images from the target domain, which are then annotated and used for supervised domain adaptation (DA). Given the large performance gap between supervised and unsupervised DA techniques, ADA allows for an excellent trade-off between annotation cost and performance. Prior art uses measures of model uncertainty or disagreement to identify 'regions' to be annotated by the human oracle. However, these regions frequently consist of pixels at object boundaries, which are hard and tedious to annotate. Hence, even if the fraction of annotated image pixels decreases, the overall annotation time and the resulting cost remain high. In this work, we propose an ADA strategy that, given a frame, identifies the set of classes that are hardest for the model to predict accurately, thereby recommending semantically meaningful regions to be annotated in a selected frame. We show that this set of 'hard' classes is context-dependent and typically varies across frames, and that annotating it helps the model generalize better. We propose two ADA techniques, the Anchor-based and the Augmentation-based approaches, to select complementary and diverse regions in the context of the current training set. Our approach achieves 66.6 mIoU on the GTA to Cityscapes dataset with an annotation budget of 4.7%, compared with 64.9 mIoU for MADA using 5% of annotations. Our technique can also be used as a decorator for any existing frame-based AL technique; e.g., we report a 1.5% performance improvement for CDAL on Cityscapes using our approach.
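The idea of asking the oracle to label whole classes in a frame, rather than scattered uncertain pixels, can be illustrated with a minimal sketch. This is not the Anchor-based or Augmentation-based criterion from the paper, whose details the abstract does not give. As a stand-in, it assumes that class "hardness" is the mean predictive entropy over the pixels the model assigns to that class. The function name `hard_class_mask` and the parameters `k` and `min_pixels` are hypothetical.

```python
import numpy as np

def hard_class_mask(probs, k=2, min_pixels=1):
    """Pick the k 'hardest' predicted classes in one frame.

    probs: (H, W, C) array of per-pixel softmax probabilities.
    Hardness is ASSUMED here to be mean predictive entropy per class;
    the paper's own selection criteria are different and not shown.
    Returns (hard_classes, mask), where mask marks pixels to annotate.
    """
    eps = 1e-12  # avoids log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=-1)  # (H, W)
    pred = probs.argmax(axis=-1)                             # (H, W)

    # Mean entropy over the region predicted as each class.
    scores = {}
    for c in np.unique(pred):
        region = pred == c
        if region.sum() >= min_pixels:
            scores[int(c)] = float(entropy[region].mean())

    # Highest mean entropy first; keep the top k classes.
    hard = sorted(scores, key=scores.get, reverse=True)[:k]
    mask = np.isin(pred, hard)
    return hard, mask

# Toy 2x2 frame with 3 classes: only the class-1 pixel is uncertain.
probs = np.array([
    [[0.98, 0.01, 0.01], [0.98, 0.01, 0.01]],
    [[0.30, 0.40, 0.30], [0.01, 0.01, 0.98]],
])
hard, mask = hard_class_mask(probs, k=1)
budget_fraction = mask.mean()  # share of pixels sent to the oracle
```

Because the mask follows predicted class regions, the annotator is asked to label semantically coherent areas instead of thin boundary strips, which is the cost argument made above.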

updated: Thu Oct 13 2022 05:23:47 GMT+0000 (UTC)

published: Thu Oct 13 2022 05:23:47 GMT+0000 (UTC)

arXiv
