Plug and Play Active Learning for Object Detection

Chenhongyi Yang; Lichao Huang; Elliot J. Crowley

Annotating data for supervised learning is expensive and tedious, and we want to do as little of it as possible. To make the most of a given "annotation budget" we can turn to active learning (AL), which aims to identify the most informative samples in a dataset for annotation. Active learning algorithms are typically uncertainty-based or diversity-based. Both have seen success in image classification, but fall short when it comes to object detection. We hypothesise that this is because: (1) it is difficult to quantify uncertainty for object detection, as it consists of both localisation and classification, where some classes are harder to localise and others are harder to classify; (2) it is difficult to measure similarities for diversity-based AL when images contain different numbers of objects. We propose a two-stage active learning algorithm, Plug and Play Active Learning (PPAL), that overcomes these difficulties. It consists of (1) Difficulty Calibrated Uncertainty Sampling, in which we use a category-wise difficulty coefficient that takes both classification and localisation into account to re-weight object uncertainties for uncertainty-based sampling; (2) Category Conditioned Matching Similarity, which computes the similarities of multi-instance images as ensembles of their instance similarities. PPAL is highly generalisable because it makes no change to model architectures or detector training pipelines. We benchmark PPAL on the MS-COCO and Pascal VOC datasets using different detector architectures and show that our method outperforms the prior state-of-the-art. Code is available at https://github.com/ChenhongyiYang/PPAL
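
The two ingredients named in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' implementation (see the linked repository for that): the function names, the use of a plain weighted sum for image-level uncertainty, cosine similarity between instance features, and best-match averaging within each category are all hypothetical choices.

```python
import numpy as np


def difficulty_weighted_uncertainty(uncertainties, labels, difficulty):
    """Image-level score: per-object uncertainties re-weighted by a
    per-category difficulty coefficient (assumed here to be given)."""
    return float(sum(difficulty[c] * u for u, c in zip(uncertainties, labels)))


def _normalise(feats):
    # L2-normalise rows so that dot products are cosine similarities.
    feats = np.asarray(feats, dtype=float)
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)


def _one_way(feats_a, labels_a, feats_b, labels_b):
    # For each object in image A, take its best match among objects of the
    # SAME category in image B (0.0 if B has no object of that category).
    scores = []
    for f, c in zip(feats_a, labels_a):
        same_class = feats_b[labels_b == c]
        scores.append(float(np.max(same_class @ f)) if len(same_class) else 0.0)
    return float(np.mean(scores)) if scores else 0.0


def matching_similarity(feats_a, labels_a, feats_b, labels_b):
    """Symmetric similarity between two multi-instance images, built from
    instance-level similarities conditioned on category (assumed form)."""
    fa, fb = _normalise(feats_a), _normalise(feats_b)
    la, lb = np.asarray(labels_a), np.asarray(labels_b)
    return 0.5 * (_one_way(fa, la, fb, lb) + _one_way(fb, lb, fa, la))
```

In a two-stage selection loop of the kind the abstract describes, a score like the first function would shortlist uncertain images, and a similarity like the second would then be used to pick a diverse subset of that shortlist. Because both functions only consume detector outputs (labels, uncertainties, instance features), a sketch of this shape leaves the detector architecture and training pipeline untouched.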

updated: Mon Nov 21 2022 16:13:23 GMT+0000 (UTC)

published: Mon Nov 21 2022 16:13:23 GMT+0000 (UTC)

arXiv
