Event Neural Networks

Matthew Dutson; Mohit Gupta

Video data is often repetitive; for example, the content of adjacent frames is usually strongly correlated. Such repetition occurs at multiple levels of complexity, from low-level pixel values to textures and high-level semantics. We propose Event Neural Networks (EvNets), a novel class of networks that leverage this repetition to achieve considerable computation savings for video inference tasks. A defining characteristic of EvNets is that each neuron has state variables that provide it with long-term memory, which allows low-cost inference even in the presence of significant camera motion. We show that it is possible to transform virtually any conventional neural network into an EvNet. We demonstrate the effectiveness of our method on several state-of-the-art neural networks for both high- and low-level visual processing, including pose recognition, object detection, optical flow, and image enhancement. We observe up to an order-of-magnitude reduction in computational costs (2-20x) as compared to conventional networks, with minimal reductions in model accuracy.
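The abstract does not spell out the neuron update rule, so the following is only an illustrative sketch of the general idea of stateful, change-driven inference, not the authors' implementation. It assumes a single linear layer in which each input remembers the last value it transmitted and each output keeps a running pre-activation; an input propagates a change only when it differs from its last transmitted value by more than a threshold. The class name `DeltaGatedLinear`, the `threshold` parameter, and the multiply counter are all hypothetical names introduced for this example.

```python
class DeltaGatedLinear:
    """Hypothetical sketch of a linear layer with per-neuron state.

    Assumption (not from the paper's text): an input "fires" only when it
    has drifted from its last transmitted value by more than `threshold`.
    """

    def __init__(self, weights, threshold):
        self.weights = weights                 # weights[j][i]: input i -> output j
        self.threshold = threshold
        self.last_sent = [0.0] * len(weights[0])  # state: last transmitted input values
        self.accum = [0.0] * len(weights)         # state: running pre-activations
        self.mult_count = 0                       # multiplies performed so far

    def step(self, x):
        for i, value in enumerate(x):
            # Compare against the last *transmitted* value, so slow drift
            # accumulates across frames and is eventually propagated.
            delta = value - self.last_sent[i]
            if abs(delta) > self.threshold:
                self.last_sent[i] = value
                for j, row in enumerate(self.weights):
                    self.accum[j] += row[i] * delta
                    self.mult_count += 1
        return list(self.accum)


layer = DeltaGatedLinear([[1.0, 2.0], [0.5, -1.0]], threshold=0.1)
out1 = layer.step([1.0, 1.0])    # first frame: both inputs fire
count1 = layer.mult_count
out2 = layer.step([1.0, 1.05])   # nearly identical frame: nothing fires
count2 = layer.mult_count
out3 = layer.step([1.0, 2.0])    # only the second input fires
count3 = layer.mult_count
```

In this toy run the second frame costs no multiplies at all, and the third frame updates only the column of weights tied to the input that changed, while the output still matches what a dense recomputation would give. A larger threshold trades accuracy for fewer updates; how that trade-off is handled in actual EvNets is not stated in the abstract.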

updated: Thu Dec 02 2021 00:08:48 GMT+0000 (UTC)

published: Thu Dec 02 2021 00:08:48 GMT+0000 (UTC)

arXiv

