Background Activation Suppression for Weakly Supervised Object Localization

Pingyu Wu; Wei Zhai; Yang Cao

Weakly supervised object localization (WSOL) aims to localize object regions using only image-level labels as supervision. Recently, a new paradigm has emerged that generates a foreground prediction map (FPM) to perform the localization task. Existing FPM-based methods use cross-entropy (CE) to evaluate the foreground prediction map and to guide the learning of the generator. We argue for using the activation value to achieve more efficient learning. This is based on the experimental observation that, for a trained network, CE converges to zero when the foreground mask covers only part of the object region, whereas the activation value keeps increasing until the mask expands to the object boundary, which indicates that more of the object area can be learned by using the activation value. In this paper, we propose a Background Activation Suppression (BAS) method. Specifically, an Activation Map Constraint (AMC) module is designed to facilitate the learning of the generator by suppressing the background activation values. Meanwhile, by using foreground region guidance and an area constraint, BAS can learn the whole region of the object. Furthermore, in the inference phase, we consider the prediction maps of different categories together to obtain the final localization results. Extensive experiments show that BAS achieves significant and consistent improvement over the baseline methods on the CUB-200-2011 and ILSVRC datasets.
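The interplay between background suppression and the area constraint described in the abstract can be illustrated with a toy calculation. This is only a sketch under assumptions, not the authors' implementation: the abstract gives no formulas, so the ratio form of the suppression term, the mean-mask area term, the `area_weight` parameter, and every function name below are hypothetical choices made for illustration.

```python
def background_ratio(activation, mask):
    """Share of the total activation that lies outside the foreground mask.

    Assumed form (not from the abstract): sum(a * (1 - m)) / sum(a).
    """
    total = 0.0
    background = 0.0
    for act_row, mask_row in zip(activation, mask):
        for a, m in zip(act_row, mask_row):
            total += a
            background += a * (1.0 - m)
    return background / total if total > 0 else 0.0


def mask_area(mask):
    """Mean mask value, used here as a stand-in for an area constraint."""
    cells = [m for row in mask for m in row]
    return sum(cells) / len(cells)


def toy_objective(activation, mask, area_weight=1.0):
    """Hypothetical combined objective: suppress background, keep the mask small."""
    return background_ratio(activation, mask) + area_weight * mask_area(mask)


# Toy 4x4 activation map: the "object" is the central 2x2 block.
activation = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Mask covering only the top half of the object.
partial_mask = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Mask covering exactly the object.
full_mask = [[1.0 if a > 0 else 0.0 for a in row] for row in activation]

# Mask covering the whole image.
everything_mask = [[1.0] * 4 for _ in range(4)]

for name, mask in [("partial", partial_mask),
                   ("full", full_mask),
                   ("everything", everything_mask)]:
    print(name, background_ratio(activation, mask), mask_area(mask),
          toy_objective(activation, mask))
```

In this toy setting the partial mask leaves half of the activation in the background, so the suppression term pushes the mask outward until it reaches the object boundary, while the area term penalizes the trivial solution of marking the whole image as foreground. The mask that matches the object gets the lowest combined value of the three.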

updated: Wed Dec 01 2021 15:53:40 GMT+0000 (UTC)

published: Wed Dec 01 2021 15:53:40 GMT+0000 (UTC)

arXiv
