Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch

Youngeun Kim; Priyadarshini Panda

Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to their sparse, asynchronous, binary event-driven (spike-driven) processing, which can yield huge energy-efficiency benefits on neuromorphic hardware. Most existing approaches to creating SNNs either convert the weights of pre-trained Artificial Neural Networks (ANNs) or train SNNs directly with surrogate-gradient backpropagation. Each approach has its pros and cons. ANN-to-SNN conversion requires at least hundreds of time-steps at inference to reach competitive accuracy, which in turn reduces the energy savings. Training SNNs from scratch with surrogate gradients reduces the latency (the total number of time-steps), but training is slow and suffers from convergence issues. As a result, this latter approach has been limited to shallow networks on simple datasets. To address this training issue, we revisit batch normalization and propose a temporal technique, Batch Normalization Through Time (BNTT). Most prior SNN work has disregarded batch normalization, deeming it ineffective for training temporal SNNs. Unlike previous work, BNTT decouples the parameters of a BNTT layer along the time axis to capture the temporal dynamics of spikes. The temporally evolving learnable parameters in BNTT allow a neuron to control its spike rate across time-steps, enabling low-latency and low-energy training from scratch. We conduct experiments on the CIFAR-10, CIFAR-100, Tiny-ImageNet and event-driven DVS-CIFAR10 datasets. BNTT allows us to train deep SNN architectures from scratch, for the first time, on complex datasets with just 25-30 time-steps. We also propose an early exit algorithm that uses the distribution of the BNTT parameters to reduce latency at inference, which further improves energy efficiency.
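The idea of decoupling batch-normalization parameters along the time axis can be illustrated with a small sketch. The abstract does not give the BNTT formula, so everything below is an assumption: a standard batch-norm form with one learnable scale per time-step and no learnable shift, feeding a leaky integrate-and-fire neuron with a hard reset. The names `BNTTSketch`, `run_lif`, `leak` and `threshold` are hypothetical and chosen only for illustration.

```python
import math


class BNTTSketch:
    """Per-time-step batch normalization for a single feature (illustrative only)."""

    def __init__(self, num_steps, eps=1e-5):
        # Assumption: one learnable scale per time-step, which is what
        # "decoupling the parameters along the time axis" is taken to mean here.
        self.gamma = [1.0] * num_steps
        self.eps = eps

    def forward(self, x, t):
        """Normalize a batch of pre-activations x (list of floats) at time-step t."""
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        scale = self.gamma[t] / math.sqrt(var + self.eps)
        return [scale * (v - mean) for v in x]


def run_lif(inputs, bntt, leak=0.9, threshold=1.0):
    """Drive leaky integrate-and-fire neurons with normalized inputs.

    inputs[t] is the batch of weighted inputs at time-step t.
    Returns the number of spikes emitted by each sample in the batch.
    """
    batch = len(inputs[0])
    membrane = [0.0] * batch
    counts = [0] * batch
    for t, x in enumerate(inputs):
        normed = bntt.forward(x, t)
        for i in range(batch):
            # Leaky integration of the normalized input.
            membrane[i] = leak * membrane[i] + normed[i]
            if membrane[i] >= threshold:
                counts[i] += 1
                membrane[i] = 0.0  # hard reset after a spike (assumption)
    return counts


if __name__ == "__main__":
    steps = [[1.0, 3.0]] * 3          # same two-sample batch at every time-step
    layer = BNTTSketch(num_steps=3)
    print(run_lif(steps, layer))      # spike counts with unit scales
    layer.gamma = [2.0, 2.0, 2.0]
    print(run_lif(steps, layer))      # spike counts with larger per-step scales
```

In this toy setting, raising the per-time-step scales increases the number of spikes for the same input, which is one way to read the abstract's claim that the time-dependent parameters let a neuron control its spike rate across time-steps.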

updated: Tue Oct 27 2020 21:42:42 GMT+0000 (UTC)

published: Mon Oct 05 2020 00:49:30 GMT+0000 (UTC)

arXiv
