Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras

Sandeep Singh Sandha; Bharathan Balaji; Luis Garcia; Mani Srivastava

Existing approaches to autonomous control of pan-tilt-zoom (PTZ) cameras use multiple stages, in which object detection and localization are performed separately from control of the PTZ mechanism. These approaches require manual labels and suffer from performance bottlenecks because errors propagate across the multi-stage flow of information. The large size of object detection neural networks also makes prior solutions infeasible for real-time deployment on resource-constrained devices. We present Eagle, an end-to-end deep reinforcement learning (RL) solution that trains a neural network policy to control the PTZ camera directly from input images. Training RL policies in the real world is cumbersome because of labeling effort, runtime environment stochasticity, and fragile experimental setups, so we introduce a photo-realistic simulation framework for training and evaluating PTZ camera control policies. Eagle achieves superior camera control performance by keeping the object of interest close to the center of captured images at high resolution, and tracks objects for up to 17% longer than the state of the art. Eagle policies are lightweight (90x fewer parameters than Yolo5s) and can run on embedded camera platforms such as Raspberry Pi (33 FPS) and Jetson Nano (38 FPS), facilitating real-time PTZ tracking in resource-constrained environments. With domain randomization, Eagle policies trained in our simulator can be transferred directly to real-world scenarios.
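The stated objective (object near the image center, captured at high resolution) can be illustrated with a small reward sketch. This is not the reward used by Eagle, which the abstract does not specify. The `BoundingBox` type, the `tracking_reward` function, the weights, and the assumption that a simulator reports a ground-truth box in normalized coordinates are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundingBox:
    """Object box in normalized image coordinates (0..1). Hypothetical type."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def tracking_reward(box: Optional[BoundingBox],
                    center_weight: float = 0.5,
                    target_area: float = 0.25) -> float:
    """Assumed reward shaping: high when the object is centered and large.

    Illustrative only; the weights and target area are arbitrary.
    """
    if box is None:
        return 0.0  # object not visible in the frame: no reward

    # Center of the box.
    cx = (box.x_min + box.x_max) / 2.0
    cy = (box.y_min + box.y_max) / 2.0

    # Distance from the image center, scaled so an image corner gives 1.0.
    max_dist = (0.5 ** 2 + 0.5 ** 2) ** 0.5
    dist = ((cx - 0.5) ** 2 + (cy - 0.5) ** 2) ** 0.5 / max_dist
    centering = 1.0 - dist

    # Fraction of the frame covered by the object, capped at target_area,
    # used here as a stand-in for "captured at high resolution".
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    resolution = min(width * height / target_area, 1.0)

    return center_weight * centering + (1.0 - center_weight) * resolution
```

Under these assumptions, a box centered in the frame and covering a quarter of it scores the maximum of 1.0, while a small box near a corner scores much lower, which would push a policy to pan and tilt toward the object and zoom in.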

updated: Mon Apr 10 2023 02:41:56 GMT+0000 (UTC)

published: Mon Apr 10 2023 02:41:56 GMT+0000 (UTC)

arXiv
