Collaborative Multi-Agent Video Fast-Forwarding

Shuyue Lan; Zhilu Wang; Ermin Wei; Amit K. Roy-Chowdhury; Qi Zhu

Multi-agent applications have recently gained significant popularity. In many computer vision tasks, a network of agents, such as a team of robots with cameras, can work collaboratively to perceive the environment for efficient and accurate situation awareness. However, these agents often have limited computation, communication, and storage resources. Thus, reducing resource consumption while still providing an accurate perception of the environment becomes an important goal when deploying multi-agent systems. To achieve this goal, we identify and leverage the overlap among different camera views in multi-agent systems to reduce the processing, transmission, and storage of redundant or unimportant video frames. Specifically, we have developed two collaborative multi-agent video fast-forwarding frameworks, one for the distributed setting and one for the centralized setting. In these frameworks, each individual agent can selectively process or skip video frames at adjustable paces based on multiple strategies via reinforcement learning. Multiple agents then collaboratively sense the environment via either 1) a consensus-based distributed framework called DMVF that periodically updates the fast-forwarding strategies of agents by establishing communication and consensus among connected neighbors, or 2) a centralized framework called MFFNet that utilizes a central controller to decide the fast-forwarding strategies for agents based on collected data. We demonstrate the efficacy and efficiency of our proposed frameworks on a real-world surveillance video dataset, VideoWeb, and a new simulated driving dataset, CarlaSim, through extensive simulations and deployment on an embedded platform with TCP communication. We show that, compared with other approaches in the literature, our frameworks achieve better coverage of important frames while significantly reducing the number of frames processed at each agent.
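
To make the idea of "consensus among connected neighbors" concrete, the following is a minimal sketch of a generic average-consensus iteration. It is not the DMVF algorithm from the paper. It assumes, hypothetically, that each agent holds one scalar importance score for its own view, that the communication graph is undirected, and that an agent picks a slower pace when its own score is above the agreed average. The names `consensus_step`, `run_consensus`, and `pick_pace`, and the pace labels `"slow"` and `"fast"`, are illustrative only.

```python
from typing import Dict, List


def consensus_step(values: Dict[int, float],
                   neighbors: Dict[int, List[int]],
                   step: float) -> Dict[int, float]:
    """One synchronous update: x_i <- x_i + step * sum_j (x_j - x_i)."""
    return {
        i: x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in values.items()
    }


def run_consensus(values: Dict[int, float],
                  neighbors: Dict[int, List[int]],
                  rounds: int = 50,
                  step: float = 0.3) -> Dict[int, float]:
    """Repeat the update; on a connected undirected graph with
    step < 1 / max_degree, all values approach the initial average."""
    for _ in range(rounds):
        values = consensus_step(values, neighbors, step)
    return values


def pick_pace(own_score: float, agreed_score: float) -> str:
    """Hypothetical rule: views scoring above the agreed average are
    processed at a slower pace (fewer frames skipped)."""
    return "slow" if own_score > agreed_score else "fast"


# Three agents on a line graph 0 - 1 - 2 (assumed topology).
local_scores = {0: 0.9, 1: 0.3, 2: 0.6}
graph = {0: [1], 1: [0, 2], 2: [1]}

agreed = run_consensus(local_scores, graph)
paces = {i: pick_pace(local_scores[i], agreed[i]) for i in local_scores}
```

Each agent only exchanges values with its direct neighbors, which is the property that makes this kind of update suitable for a distributed setting. A centralized variant would instead collect all scores at one controller and assign paces directly.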

updated: Sat May 27 2023 20:12:19 GMT+0000 (UTC)

published: Sat May 27 2023 20:12:19 GMT+0000 (UTC)

arXiv
