Towards Precise Weakly Supervised Object Detection via Interactive Contrastive Learning of Context Information

Qi Lai; ChiMan Vong

Weakly supervised object detection (WSOD) aims to learn precise object detectors from image-level tags only. Despite intensive research on deep learning (DL) approaches over the past few years, a significant performance gap remains between WSOD and fully supervised object detection. Most existing WSOD methods consider only the visual appearance of each region proposal and neglect the useful context information in the image. To this end, this paper proposes an interactive end-to-end WSOD framework called JLWSOD with two innovations: i) two types of WSOD-specific context information (i.e., instance-wise correlation and semantic-wise correlation) are proposed and introduced into the WSOD framework; ii) an interactive graph contrastive learning (iGCL) mechanism is designed to jointly optimize the visual appearance and context information for better WSOD performance. Specifically, the iGCL mechanism takes full advantage of the complementary interpretations of WSOD, namely the instance-wise detection and semantic-wise prediction tasks, forming a more comprehensive solution. Extensive experiments on the widely used PASCAL VOC and MS COCO benchmarks verify the superiority of JLWSOD over alternative state-of-the-art approaches and baseline models (improvements of 3.6% to 23.3% in mAP and 3.4% to 19.7% in CorLoc, respectively).

updated: Fri May 05 2023 10:08:26 GMT+0000 (UTC)

published: Thu Apr 27 2023 11:54:41 GMT+0000 (UTC)

arXiv
