A Coarse to Fine Framework for Object Detection in High Resolution Image

Jinyan Liu; Jie Chen

Object detection is a fundamental problem in computer vision that aims to locate and classify objects in an image. Although current devices can easily capture very high-resolution images, existing object detection approaches seldom address tiny objects or large scale variance in such images. In this paper, we introduce a simple yet efficient approach that improves detection accuracy, especially for small objects and scenes with large scale variance, while reducing the computational cost on high-resolution images. Our approach is inspired by two observations. First, when an image is properly down-sampled, overall detection accuracy drops but the recall rate is not significantly reduced. Second, small objects are detected better on high-resolution input, even with a lightweight detector. We therefore propose a cluster-based coarse-to-fine object detection framework that improves performance on small objects while preserving accuracy on large objects in high-resolution images. In the first stage, we perform coarse detection on the down-sampled image and use a lightweight detector to localize the centers of small objects on the high-resolution image; we then obtain image chips with a cluster region generation method driven by the coarse detection and center localization results, and send the chips to the second-stage detector for fine detection. Finally, we merge the coarse and fine detection results. Our approach exploits the sparsity of objects and the information in high-resolution images, making detection more efficient. Experimental results show that the proposed approach achieves promising performance compared with other state-of-the-art detectors.
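The abstract does not specify how cluster regions are generated or how coarse and fine results are merged, so the following is only a minimal sketch of the data flow it describes, under stated assumptions. The greedy centroid clustering, the fixed `margin` padding, and the use of non-maximum suppression for merging are illustrative stand-ins, not the authors' method, and every function name here is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Point = Tuple[float, float]


def upscale_box(box: Box, factor: float) -> Box:
    """Map a box predicted on the down-sampled image back to full resolution."""
    return (box[0] * factor, box[1] * factor, box[2] * factor, box[3] * factor)


def box_center(box: Box) -> Point:
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def cluster_points(points: List[Point], radius: float) -> List[List[Point]]:
    """Assumed clustering: a point joins the first cluster whose centroid is within radius."""
    clusters: List[List[Point]] = []
    for px, py in points:
        for cluster in clusters:
            cx = sum(p[0] for p in cluster) / len(cluster)
            cy = sum(p[1] for p in cluster) / len(cluster)
            if (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2:
                cluster.append((px, py))
                break
        else:  # no existing cluster was close enough
            clusters.append([(px, py)])
    return clusters


def chips_from_clusters(clusters: List[List[Point]], margin: float,
                        width: float, height: float) -> List[Box]:
    """Pad each cluster's bounding rectangle by margin and clip it to the image."""
    chips = []
    for cluster in clusters:
        xs = [p[0] for p in cluster]
        ys = [p[1] for p in cluster]
        chips.append((max(0.0, min(xs) - margin), max(0.0, min(ys) - margin),
                      min(float(width), max(xs) + margin),
                      min(float(height), max(ys) + margin)))
    return chips


def chip_to_image(box: Box, chip: Box) -> Box:
    """Shift a box predicted inside a chip into full-image coordinates."""
    ox, oy = chip[0], chip[1]
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)


def iou(a: Box, b: Box) -> float:
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge(dets: List[Tuple[Box, float]], iou_thr: float = 0.5) -> List[Tuple[Box, float]]:
    """Assumed merge step: plain NMS over the pooled coarse and fine detections."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_thr for k in kept):
            kept.append((box, score))
    return kept
```

In this sketch, the coarse boxes are first rescaled with `upscale_box`, their centers are pooled with the centers reported by the lightweight detector, and `chips_from_clusters` yields the regions to crop for the second-stage detector. Fine detections are then mapped back with `chip_to_image` and pooled with the coarse ones in `merge`. Because objects are sparse, the chips typically cover only a fraction of the full-resolution image, which is where the efficiency gain described in the abstract would come from.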

updated: Thu Mar 02 2023 13:04:33 GMT+0000 (UTC)

published: Thu Mar 02 2023 13:04:33 GMT+0000 (UTC)

arXiv
