Large-Scale Person Detection and Localization using Overhead Fisheye Cameras

Lu Yang; Liulei Li; Xueshi Xin; Yifan Sun; Qing Song; Wenguan Wang

Localization has wide application in daily life. Whereas existing efforts are devoted to localizing tourist photos captured by perspective cameras, this article focuses on devising person positioning solutions using overhead fisheye cameras. Such solutions offer a large field of view (FOV), low cost, robustness to occlusion, and a non-intrusive working mode (persons need not carry cameras). However, related studies are scarce due to the paucity of data. To stimulate research in this area, we present LOAF, the first large-scale overhead fisheye dataset for person detection and localization. LOAF is built with several essential features: i) the data cover a rich diversity of scenes, human poses, densities, and locations; ii) it contains the largest number of annotated pedestrians to date, i.e., 457K bounding boxes with ground-truth location information; iii) the body boxes are labeled as radius-aligned so as to fully address the positioning challenge. To approach localization, we build a fisheye person detection network, which exploits the fisheye distortions through a rotation-equivariant training strategy and predicts radius-aligned human boxes end-to-end. The actual locations of the detected persons are then calculated by a numerical solution based on the fisheye model and the camera altitude data. Extensive experiments on LOAF validate the superiority of our fisheye detector over previous methods, and show that our whole fisheye positioning solution is able to locate all persons in the FOV with an accuracy of 0.5 m, within 0.1 s.
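The localization step described in the abstract (a numerical solution on the fisheye model combined with the camera altitude) can be illustrated with a minimal sketch. The abstract does not state which fisheye model or solver the authors use, so everything below is an assumption made for illustration: an odd-polynomial radial model r = k1·θ + k2·θ³ + ..., Newton's method to invert it, a camera pointing straight down, and an image point that lies on the ground plane (for example, the foot point of a detected box). The function and parameter names are hypothetical.

```python
import math


def theta_from_radius(r_px, coeffs, iters=20):
    """Numerically invert r = k1*theta + k2*theta**3 + ... with Newton's method.

    r_px   : radial distance of the pixel from the distortion center (pixels)
    coeffs : (k1, k2, ...) of the assumed odd-polynomial fisheye model
    Returns the incidence angle theta (radians) between the ray and the optical axis.
    """
    theta = r_px / coeffs[0]  # initial guess: pure equidistant model r = k1*theta
    for _ in range(iters):
        r = sum(k * theta ** (2 * i + 1) for i, k in enumerate(coeffs))
        dr = sum(k * (2 * i + 1) * theta ** (2 * i) for i, k in enumerate(coeffs))
        step = (r - r_px) / dr
        theta -= step
        if abs(step) < 1e-12:
            break
    return theta


def locate_on_ground(u, v, center, coeffs, camera_height_m):
    """Map a pixel (u, v) to ground-plane coordinates (x, y) in meters.

    Assumes the camera looks straight down from camera_height_m and that the
    pixel images a point on the ground. The origin is directly below the camera.
    """
    dx, dy = u - center[0], v - center[1]
    r_px = math.hypot(dx, dy)
    if r_px == 0.0:
        return (0.0, 0.0)  # point directly below the camera
    theta = theta_from_radius(r_px, coeffs)
    if theta >= math.pi / 2:
        raise ValueError("ray does not intersect the ground plane")
    ground_dist = camera_height_m * math.tan(theta)  # horizontal distance from the camera axis
    phi = math.atan2(dy, dx)  # azimuth is preserved by a radially symmetric lens
    return (ground_dist * math.cos(phi), ground_dist * math.sin(phi))
```

Under these assumptions, a pixel whose ray makes a 45° angle with the optical axis maps to a ground distance equal to the camera height, which gives a quick sanity check for a calibrated setup. A real deployment would need calibrated coefficients and a correction for camera tilt, neither of which is covered here.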

updated: Mon Jul 17 2023 05:36:01 GMT+0000 (UTC)

published: Mon Jul 17 2023 05:36:01 GMT+0000 (UTC)

arXiv
