Object-Level Targeted Selection via Deep Template Matching

Suraj Kothawade; Donna Roy; Michele Fenzi; Elmar Haussmann; Jose M. Alvarez; Christoph Angerer

Retrieving images with objects that are semantically similar to objects of interest (OOI) in a query image has many practical use cases. Examples include fixing failures of a learned model, such as false negatives or false positives, and mitigating class imbalance in a dataset. The targeted selection task requires finding the relevant data in a large-scale pool of unlabeled data. Manual mining at this scale is infeasible. Further, the OOI are often small, occupying less than 1% of the image area, are occluded, and co-exist with many semantically different objects in cluttered scenes. Existing semantic image retrieval methods often focus on mining for larger-sized geographical landmarks, and/or require extra labeled data, such as images or image pairs with similar objects, for mining images with generic objects. We propose a fast and robust template matching algorithm in the DNN feature space that retrieves semantically similar images at the object level from a large unlabeled pool of data. We project the region(s) around the OOI in the query image to the DNN feature space for use as the template. This enables our method to focus on the semantics of the OOI without requiring extra labeled data. In the context of autonomous driving, we evaluate our system for targeted selection by using failure cases of object detectors as OOI. We demonstrate its efficacy on a large unlabeled dataset with 2.2M images and show high recall in mining for images with small-sized OOI. We compare our method against a well-known semantic image retrieval method that also does not require extra labeled data. Lastly, we show that our method is flexible and seamlessly retrieves images with one or more semantically different co-occurring OOI.
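The general idea of template matching in a feature space can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each image has already been converted to a feature map of shape (C, H, W) by some DNN backbone, that the OOI region is given as a box in feature-map cells, and that windows are compared with cosine similarity. The function names `crop_template`, `match_score`, and `rank_pool`, the box format, and the choice of cosine similarity are all hypothetical and chosen only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def crop_template(query_features, box):
    """Cut the OOI region out of a query feature map of shape (C, H, W).

    `box` is (top, left, height, width) in feature-map cells
    (a hypothetical format assumed for this sketch).
    """
    top, left, height, width = box
    return query_features[:, top:top + height, left:left + width]


def match_score(candidate_features, template, eps=1e-8):
    """Best cosine similarity between the template and any same-sized window."""
    _, h, w = template.shape
    # All (h, w) windows over the spatial axes: shape (C, H-h+1, W-w+1, h, w).
    windows = sliding_window_view(candidate_features, (h, w), axis=(1, 2))
    # Put the window position first: shape (H-h+1, W-w+1, C, h, w).
    windows = np.moveaxis(windows, 0, 2)
    # One flat vector per window position.
    flat = windows.reshape(windows.shape[0], windows.shape[1], -1)
    t = template.reshape(-1)
    sims = flat @ t / (np.linalg.norm(flat, axis=-1) * np.linalg.norm(t) + eps)
    return float(sims.max())


def rank_pool(pool_features, template):
    """Return pool indices ordered from best to worst match."""
    scores = [match_score(f, template) for f in pool_features]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(4, 6, 8))            # stand-in for DNN features
    template = crop_template(query, (1, 2, 2, 3))
    pool = [rng.normal(size=(4, 6, 8)) for _ in range(3)]
    pool[1][:, 3:5, 4:7] = template               # plant the OOI in one image
    print(rank_pool(pool, template))
```

Because the score is the maximum over all window positions, a small OOI can dominate the ranking even when the rest of the scene differs, which is the property the abstract emphasizes. A real system at the scale of millions of images would need a much faster search than this exhaustive loop.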

updated: Tue Jul 05 2022 02:32:34 GMT+0000 (UTC)

published: Tue Jul 05 2022 02:32:34 GMT+0000 (UTC)

arXiv
