WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection

Shijie Fang; Yuhang Cao; Xinjiang Wang; Kai Chen; Dahua Lin; Wayne Zhang

The performance of object detection depends, to a great extent, on the availability of large annotated datasets. To alleviate the annotation cost, the research community has explored a number of ways to exploit unlabeled or weakly labeled data. However, such efforts have met with limited success so far. In this work, we revisit the problem from a pragmatic standpoint, exploring a new balance between detection performance and annotation cost by jointly exploiting fully and weakly annotated data. Specifically, we propose a weakly- and semi-supervised object detection framework (WSSOD), which involves a two-stage learning procedure. An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly annotated images. The underlying assumptions in the current as well as common semi-supervised pipelines are also carefully examined under a unified EM formulation. On top of this framework, weakly-supervised loss (WSL), label attention, and random pseudo-label sampling (RPS) strategies are introduced to relax these assumptions, further improving the efficacy of the detection pipeline. The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully supervised settings with only one third of the annotations.
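The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that weak annotations are image-level class labels, and the `Box` type, the `pseudo_boxes` helper, and the `min_score` threshold are hypothetical names introduced here for illustration only. The idea shown is simply that predictions from an agent detector are kept as pseudo bounding boxes only when they agree with the weak label of the image and are sufficiently confident.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Box:
    """Hypothetical container for one detector prediction."""
    x1: float
    y1: float
    x2: float
    y2: float
    label: str
    score: float


def pseudo_boxes(predictions: List[Box],
                 image_labels: Set[str],
                 min_score: float = 0.5) -> List[Box]:
    """Keep predictions that match the image-level labels (assumed form of
    weak annotation) and whose confidence reaches min_score (assumed value)."""
    return [b for b in predictions
            if b.label in image_labels and b.score >= min_score]


# Toy predictions standing in for the output of an agent detector.
preds = [
    Box(10, 10, 50, 60, "dog", 0.92),
    Box(30, 20, 80, 90, "cat", 0.81),   # class not in the weak label set
    Box(5, 5, 20, 20, "dog", 0.30),     # below the confidence threshold
]
kept = pseudo_boxes(preds, image_labels={"dog"})
print([(b.label, b.score) for b in kept])  # prints [('dog', 0.92)]
```

In this toy case only the confident "dog" box survives. The strategies named in the abstract (WSL, label attention, RPS) are not reproduced here, since the abstract does not specify them.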

updated: Fri May 21 2021 11:58:50 GMT+0000 (UTC)

published: Fri May 21 2021 11:58:50 GMT+0000 (UTC)

arXiv
