Query-based Hard-Image Retrieval for Object Detection at Test Time

Edward Ayers; Jonathan Sadeghi; John Redford; Romain Mueller; Puneet K. Dokania

There is a longstanding interest in capturing the error behaviour of object detectors by finding images where their performance is likely to be unsatisfactory. In real-world applications such as autonomous driving, it is also crucial to characterise potential failures beyond simple requirements of detection performance. For example, a missed detection of a pedestrian close to an ego vehicle will generally require closer inspection than a missed detection of a car in the distance. The problem of predicting such potential failures at test time has largely been overlooked in the literature, and conventional approaches based on detection uncertainty fall short in that they are agnostic to such fine-grained characterisation of errors. In this work, we propose to reformulate the problem of finding "hard" images as a query-based hard image retrieval task, where queries are specific definitions of "hardness", and offer a simple and intuitive method that can solve this task for a large family of queries. Our method is entirely post-hoc, does not require ground-truth annotations, is independent of the choice of detector, and relies on an efficient Monte Carlo estimation that uses a simple stochastic model in place of the ground truth. We show experimentally that it applies to a wide variety of queries, for which it reliably identifies hard images for a given detector without any labelled data. We provide results on ranking and classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN, and Cascade Mask-RCNN object detectors.
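The abstract does not spell out the stochastic model or the estimator, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each detection is a real object with probability equal to its confidence score, that the deployed detector keeps only detections above a fixed threshold, and that a query is a function of the sampled objects the thresholded detector would miss. The names `Detection`, `estimate_hardness`, and `missed_pedestrians` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a class label and a confidence in [0, 1]."""
    label: str
    confidence: float


def estimate_hardness(
    detections: List[Detection],
    query: Callable[[List[Detection]], float],
    threshold: float = 0.5,
    n_samples: int = 2000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of the expected query value for one image.

    Assumed stochastic model (not taken from the paper): each detection is a
    real object with probability equal to its confidence. Sampled objects whose
    confidence falls below `threshold` count as missed by the detector.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw one pseudo ground truth from the detector's own outputs.
        sampled = [d for d in detections if rng.random() < d.confidence]
        # Objects in the pseudo ground truth that thresholding would discard.
        missed = [d for d in sampled if d.confidence < threshold]
        total += query(missed)
    return total / n_samples


def missed_pedestrians(missed: List[Detection]) -> float:
    """Example query: number of missed objects labelled 'pedestrian'."""
    return float(sum(1 for d in missed if d.label == "pedestrian"))


# Rank two images by estimated hardness, with no labels involved.
images = {
    "image_a": [Detection("pedestrian", 0.4), Detection("car", 0.9)],
    "image_b": [Detection("pedestrian", 0.9)],
}
scores = {name: estimate_hardness(dets, missed_pedestrians) for name, dets in images.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, an image with a low-confidence pedestrian ranks as harder for this query than one whose pedestrian is detected confidently. Swapping in a different query function (for example, one that counts only nearby objects) changes the notion of hardness without touching the estimator.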

updated: Fri Sep 23 2022 12:33:31 GMT+0000 (UTC)

published: Fri Sep 23 2022 12:33:31 GMT+0000 (UTC)

arXiv
