A Delay Metric for Video Object Detection: What Average Precision Fails to Tell

Huizi Mao; Xiaodong Yang; William J. Dally

Average precision (AP) is a widely used metric to evaluate the detection accuracy of image and video object detectors. In this paper, we analyze object detection from videos and point out that AP alone is not sufficient to capture the temporal nature of video object detection. To tackle this problem, we propose a comprehensive metric, average delay (AD), to measure and compare detection delay. To facilitate delay evaluation, we carefully select a subset of ImageNet VID with an emphasis on complex trajectories, which we name ImageNet VIDT. By extensively evaluating a wide range of detectors on VIDT, we show that most methods drastically increase the detection delay but still preserve AP well. In other words, AP is not sensitive enough to reflect the temporal characteristics of a video object detector. Our results suggest that video object detection methods should be additionally evaluated with a delay metric, particularly for latency-critical applications such as autonomous vehicle perception.
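The abstract does not give the formal definition of AD, so the following is only an illustrative sketch of the underlying idea, not the paper's metric. It assumes a simplified notion of delay: the number of frames between an object's first appearance and its first detection. The function names (`detection_delay`, `mean_delay`, `frame_recall`) and the handling of never-detected objects are hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def detection_delay(detected: Sequence[bool]) -> Optional[int]:
    """Frames elapsed before an object is first detected.

    detected[i] is True if the object was detected in the i-th frame
    after it first appeared. Returns None if it was never detected.
    """
    for i, hit in enumerate(detected):
        if hit:
            return i
    return None


def mean_delay(tracks: Sequence[Sequence[bool]]) -> float:
    """Mean delay over object tracks.

    Assumption (not from the paper): a never-detected object is
    charged the full length of its track.
    """
    if not tracks:
        raise ValueError("at least one track is required")
    delays = []
    for track in tracks:
        d = detection_delay(track)
        delays.append(len(track) if d is None else d)
    return sum(delays) / len(delays)


def frame_recall(detected: Sequence[bool]) -> float:
    """Fraction of frames in which the object was detected."""
    return sum(detected) / len(detected)


# Two hypothetical detectors observing the same 10-frame object track.
late = [False] * 3 + [True] * 7   # misses the first three frames
early = [True] * 7 + [False] * 3  # misses the last three frames

print(frame_recall(late), frame_recall(early))
print(detection_delay(late), detection_delay(early))
```

Both hypothetical detectors find the object in the same fraction of frames, so a per-frame accuracy measure cannot tell them apart, yet one reacts three frames later than the other. This is the kind of temporal difference the abstract argues AP fails to reflect.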

updated: Wed Nov 06 2019 22:50:02 GMT+0000 (UTC)

published: Sun Aug 18 2019 03:36:23 GMT+0000 (UTC)

arXiv
