Elixir: A system to enhance data quality for multiple analytics on a video stream

Sibendu Paul; Kunal Rao; Giuseppe Coviello; Murugan Sankaradas; Oliver Po; Y. Charlie Hu; Srimat T. Chakradhar

IoT sensors, especially video cameras, are ubiquitously deployed around the world to perform a variety of computer vision tasks in several verticals, including retail, healthcare, safety and security, transportation, and manufacturing. To amortize their high deployment effort and cost, it is desirable to perform multiple video analytics tasks, which we refer to as Analytical Units (AUs), on the video feed coming out of every camera. In this paper, we first show that in a multi-AU setting, changing the camera setting has a disproportionate impact on the performance of different AUs. In particular, the optimal setting for one AU may severely degrade the performance of another AU, and the impact on different AUs also varies as the environmental conditions change. We then present Elixir, a system to enhance the video stream quality for multiple analytics on a video stream. Elixir leverages Multi-Objective Reinforcement Learning (MORL), where the RL agent caters to the objectives of the different AUs and adjusts the camera setting to simultaneously enhance the performance of all AUs. To define the multiple objectives in MORL, we develop new AU-specific quality estimator values for each individual AU. We evaluate Elixir through real-world experiments on a testbed with three cameras deployed next to each other (overlooking a large enterprise parking lot), running Elixir and two baseline approaches, respectively. Elixir correctly detects 7.1% (22,068) and 5.0% (15,731) more cars, 94% (551) and 72% (478) more faces, and 670.4% (4,975) and 158.6% (3,507) more persons than the default-setting and time-sharing approaches, respectively. It also detects 115 license plates, far more than the time-sharing approach (7) and the default setting (0).
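The idea of tuning one camera setting against several AU objectives at once can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two knobs (brightness, contrast), the made-up quality curves in `toy_quality`, the equal weights, and the use of linear scalarization with a simple epsilon-greedy bandit instead of a full RL agent. The abstract does not say how Elixir combines its per-AU objectives, so this should be read as one common MORL pattern, not as Elixir's method.

```python
import random

# Hypothetical camera settings as (brightness, contrast) pairs; illustrative only.
SETTINGS = [(b, c) for b in (25, 50, 75) for c in (25, 50, 75)]


def scalarize(qualities, weights):
    """Combine per-AU quality estimates into one reward via a weighted sum."""
    return sum(weights[au] * q for au, q in qualities.items())


def toy_quality(setting):
    """Stand-in for AU-specific quality estimators (made-up response curves).

    The face AU prefers high brightness and the car AU prefers low contrast,
    so each AU on its own would pick a different camera setting.
    """
    brightness, contrast = setting
    return {
        "face": 1.0 - abs(brightness - 75) / 100.0,
        "car": 1.0 - abs(contrast - 25) / 100.0,
    }


def run_bandit(weights, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the setting with the best scalarized reward."""
    rng = random.Random(seed)
    value = {s: 0.0 for s in SETTINGS}  # running mean reward per setting
    count = {s: 0 for s in SETTINGS}
    for _ in range(steps):
        if rng.random() < epsilon:
            setting = rng.choice(SETTINGS)  # explore
        else:
            setting = max(SETTINGS, key=lambda s: value[s])  # exploit
        reward = scalarize(toy_quality(setting), weights)
        count[setting] += 1
        value[setting] += (reward - value[setting]) / count[setting]
    return max(SETTINGS, key=lambda s: value[s])


best = run_bandit({"face": 0.5, "car": 0.5})
print(best)
```

In this toy setup the two objectives happen to depend on different knobs, so a single setting can satisfy both. When objectives conflict on the same knob, the weights decide the trade-off, which is the situation the abstract describes for real AUs.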

updated: Thu Dec 08 2022 04:04:58 GMT+0000 (UTC)

published: Thu Dec 08 2022 04:04:58 GMT+0000 (UTC)

arXiv
