Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch

Youngeun Kim; Priyadarshini Panda

Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to their sparse, asynchronous, and binary event (or spike) driven processing, which can yield large energy-efficiency benefits on neuromorphic hardware. However, training high-accuracy, low-latency SNNs from scratch suffers from the non-differentiable nature of a spiking neuron. To address this training issue in SNNs, we revisit batch normalization and propose a temporal Batch Normalization Through Time (BNTT) technique. Most prior SNN work has disregarded batch normalization, deeming it ineffective for training temporal SNNs. Unlike previous work, our proposed BNTT decouples the parameters in a BNTT layer along the time axis to capture the temporal dynamics of spikes. The temporally evolving learnable parameters in BNTT allow a neuron to control its spike rate across different time-steps, enabling low-latency and low-energy training from scratch. We conduct experiments on the CIFAR-10, CIFAR-100, Tiny-ImageNet, and event-driven DVS-CIFAR10 datasets. BNTT allows us to train deep SNN architectures from scratch, for the first time, on complex datasets with just 25-30 time-steps. We also propose an early exit algorithm that uses the distribution of parameters in BNTT to reduce latency at inference, which further improves energy efficiency.
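The idea of decoupling normalization parameters along the time axis can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the details here are assumptions: the function name `bntt_forward` is hypothetical, batch statistics are computed separately at each time-step, and only a learnable scale `gamma[t]` (no shift term) is applied. Running statistics for inference and the spiking neuron itself are left out.

```python
import math


def bntt_forward(x, gamma, eps=1e-5):
    """Normalize x[t][n][c] with separate batch statistics at each time-step.

    x     : nested list of shape (T, N, C): T time-steps, N samples, C channels
    gamma : nested list of shape (T, C): one scale per time-step and channel
            (assumed parameterization, not taken from the abstract)
    """
    out = []
    for t, batch in enumerate(x):
        n, c = len(batch), len(batch[0])
        # Per-channel mean and biased variance over the batch at time-step t only.
        mean = [sum(row[j] for row in batch) / n for j in range(c)]
        var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(c)]
        # Scale the normalized input with the parameter that belongs to time-step t.
        out.append([
            [gamma[t][j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps)
             for j in range(c)]
            for row in batch
        ])
    return out


# Two time-steps, three samples, one channel; the values are made up for illustration.
x = [[[1.0], [2.0], [3.0]],
     [[10.0], [20.0], [30.0]]]
gamma = [[1.0], [0.5]]
y = bntt_forward(x, gamma)
```

Because each time-step has its own `gamma[t]`, the same normalized input is scaled differently at different time-steps, which is one way a layer could modulate how strongly a neuron is driven (and hence its spike rate) over time.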

updated: Wed Nov 10 2021 21:23:44 GMT+0000 (UTC)

published: Mon Oct 05 2020 00:49:30 GMT+0000 (UTC)

arXiv
