Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

Chaohui Yu; Qiang Zhou; Jingliang Li; Jianlong Yuan; Zhibin Wang; Fan Wang

Modern incremental learning methods for semantic segmentation usually learn new categories from dense annotations. Although they achieve promising results, pixel-by-pixel labeling is costly and time-consuming. Weakly incremental learning for semantic segmentation (WILSS) is a novel and attractive task that aims to learn to segment new classes from cheap and widely available image-level labels. Despite comparable results, image-level labels cannot provide the details needed to locate each segment, which limits the performance of WILSS. This motivates us to consider how to improve and effectively use the supervision for new classes given only image-level labels, while avoiding forgetting old classes. In this work, we propose a novel and data-efficient framework for WILSS, named FMWISS. Specifically, we propose pre-training based co-segmentation to distill the knowledge of complementary foundation models for generating dense pseudo labels. We further optimize the noisy pseudo masks with a teacher-student architecture, where a plug-in teacher is optimized with a proposed dense contrastive loss. Moreover, we introduce memory-based copy-paste augmentation to mitigate catastrophic forgetting of old classes. Extensive experiments on the Pascal VOC and COCO datasets demonstrate the superior performance of our framework, e.g., FMWISS achieves 70.7% and 73.3% in the 15-5 VOC setting, outperforming the state-of-the-art method by 3.4% and 6.1%, respectively.
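The abstract names memory-based copy-paste augmentation but does not describe its details. The following is a minimal sketch of the general copy-paste idea only, assuming a memory bank that stores old-class images together with binary masks of the same size as the training image. The function name `copy_paste`, its arguments, and the label ids are hypothetical and are not taken from the paper.

```python
import numpy as np


def copy_paste(image, label, mem_image, mem_mask, mem_class):
    """Paste the masked pixels of a stored old-class sample onto a new image.

    Assumed shapes (illustrative, not from the paper):
      image:     (H, W, 3) uint8, training image for the current step
      label:     (H, W) int, pseudo label map for that image
      mem_image: (H, W, 3) uint8, image stored in the memory bank
      mem_mask:  (H, W) bool, pixels of mem_image belonging to the old class
      mem_class: int, label id of that old class
    """
    # Work on copies so the caller's arrays are left untouched.
    out_image = image.copy()
    out_label = label.copy()
    # Boolean indexing overwrites only the pixels covered by the mask.
    out_image[mem_mask] = mem_image[mem_mask]
    out_label[mem_mask] = mem_class
    return out_image, out_label


# Toy usage: paste a 2x2 old-class patch (hypothetical class id 3)
# onto an image whose pseudo label is a new class (hypothetical id 16).
image = np.zeros((4, 4, 3), dtype=np.uint8)
label = np.full((4, 4), 16)
mem_image = np.full((4, 4, 3), 255, dtype=np.uint8)
mem_mask = np.zeros((4, 4), dtype=bool)
mem_mask[1:3, 1:3] = True

aug_image, aug_label = copy_paste(image, label, mem_image, mem_mask, mem_class=3)
```

In this sketch the pasted region carries its old-class label into the augmented label map, so a training batch drawn from new-class data still contains supervised old-class pixels. How the paper selects, stores, and places memory samples is not stated in the abstract.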

updated: Thu Apr 20 2023 08:12:44 GMT+0000 (UTC)

published: Tue Feb 28 2023 02:21:42 GMT+0000 (UTC)

arXiv
