Training for temporal sparsity in deep neural networks, application in video processing

Amirreza Yousefzadeh; Manolis Sifalakis

Activation sparsity improves compute efficiency and resource utilization in sparsity-aware neural network accelerators. Because the predominant operation in DNNs is the multiply-accumulate (MAC) of activations with weights to compute inner products, skipping operations in which at least one of the two operands is zero can make inference more efficient in terms of latency and power. Spatial sparsification of activations is a popular topic in the DNN literature, and several methods have already been established to bias a DNN toward it. Temporal sparsity, on the other hand, is an inherent feature of bio-inspired spiking neural networks (SNNs), which neuromorphic processing exploits for hardware efficiency. Introducing and exploiting spatio-temporal sparsity is much less explored in the DNN literature, yet it fits well with the trend in DNNs to shift from static signal processing toward streaming signal processing. Toward this goal, we introduce a new DNN layer, called the Delta Activation Layer, whose sole purpose is to promote temporal sparsity of activations during training. A Delta Activation Layer casts temporal sparsity into spatial activation sparsity, which can then be exploited when performing sparse tensor multiplications in hardware. By employing delta inference and the usual spatial sparsification heuristics during training, the resulting model learns to exploit not only spatial but also temporal activation sparsity (for a given input data distribution). The Delta Activation Layer can be used either during vanilla training or during a refinement phase. We have implemented the Delta Activation Layer as an extension of the standard TensorFlow-Keras library and applied it to train deep neural networks on the Human Action Recognition (UCF101) dataset. We report an almost 3x improvement in activation sparsity, with a loss of model accuracy that is recoverable through longer training.
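To make the idea of casting temporal sparsity into spatial sparsity concrete, the following is a minimal sketch in plain Python, not the authors' TensorFlow-Keras layer. The function names (`delta_encode`, `delta_matvec`, `sparsity`), the `threshold` parameter, and the toy frames and weights are all assumptions introduced for illustration; the abstract does not specify how the layer computes or thresholds deltas. The sketch encodes each frame as its change relative to the last transmitted state, then shows that a linear layer can rebuild its output by accumulating only the nonzero deltas.

```python
def delta_encode(frames, threshold=0.0):
    """Encode each frame as its change against the last transmitted state.

    `threshold` is a hypothetical knob: changes with magnitude at or below
    it are suppressed (emitted as 0.0) and the reference is left untouched.
    """
    reference = [0.0] * len(frames[0])
    deltas = []
    for frame in frames:
        delta = []
        for i, value in enumerate(frame):
            change = value - reference[i]
            if abs(change) > threshold:
                delta.append(change)
                reference[i] = value  # only transmitted changes update the state
            else:
                delta.append(0.0)
        deltas.append(delta)
    return deltas


def sparsity(rows):
    """Fraction of entries that are exactly zero."""
    values = [v for row in rows for v in row]
    return sum(1 for v in values if v == 0.0) / len(values)


def delta_matvec(weights, deltas):
    """Accumulate W @ x_t over time from deltas, skipping zero activations.

    Returns the output per time step and the number of MACs performed.
    """
    state = [0.0] * len(weights)
    outputs = []
    macs = 0
    for delta in deltas:
        for j, d in enumerate(delta):
            if d == 0.0:
                continue  # zero operand: the whole weight column is skipped
            for i, row in enumerate(weights):
                state[i] += row[j] * d
                macs += 1
        outputs.append(list(state))
    return outputs, macs


# Toy "video": three frames of four activations that change slowly over time.
frames = [
    [1.0, 0.0, 2.0, 3.0],
    [1.0, 0.0, 2.0, 3.5],
    [1.0, 0.5, 2.0, 3.5],
]
weights = [
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
]

deltas = delta_encode(frames)
outputs, macs = delta_matvec(weights, deltas)
dense_macs = len(frames) * len(weights) * len(frames[0])
dense_last = [sum(w * x for w, x in zip(row, frames[-1])) for row in weights]
```

In this toy case the frames themselves are mostly nonzero, while the deltas are mostly zero, so the accumulated output matches the dense result with fewer MACs. The sketch relies on the downstream operation being linear; how the actual layer interacts with nonlinearities and training is not described in the abstract and is not modeled here.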

updated: Thu Jul 15 2021 13:17:11 GMT+0000 (UTC)

published: Thu Jul 15 2021 13:17:11 GMT+0000 (UTC)

arXiv
