An Adaptive Sampling and Edge Detection Approach for Encoding Static Images for Spiking Neural Networks

Peyton Chandarana; Junlin Ou; Ramtin Zand

Current state-of-the-art methods of image classification using convolutional neural networks are often constrained by both latency and power consumption. This limits the devices that can employ these methods, particularly low-power edge devices. Spiking neural networks (SNNs), considered to be the third generation of artificial neural networks, aim to address these latency and power constraints by taking inspiration from biological neuronal communication processes. Before data such as images can be input into an SNN, however, they must first be encoded into spike trains. Herein, we propose a method for encoding static images into temporal spike trains for use in SNNs, based on edge detection and adaptive signal sampling. The edge detection process consists of first performing Canny edge detection on the 2D static images and then converting the edge-detected images into two signals, X and Y, using an image-to-signal conversion method. The adaptive sampling approach consists of sampling the signals such that they retain enough detail and remain sensitive to abrupt changes. Temporal encoding mechanisms such as threshold-based representation (TBR) and step-forward (SF) can then be used to convert the sampled signals into spike trains. We use various error and indicator metrics to optimize and evaluate the efficiency and precision of the proposed image encoding approach. When used to encode the MNIST dataset, a comparison between the original signals and the signals reconstructed from spike trains generated using edge detection and the adaptive temporal encoding mechanism shows 18x and 7x reductions in average root mean square error (RMSE) compared to conventional SF and TBR encoding, respectively.
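
The abstract names step-forward (SF) encoding and an RMSE comparison between original and reconstructed signals but gives no implementation. The sketch below illustrates only that last stage, using the commonly described form of SF encoding: a spike is emitted whenever the signal moves more than a threshold away from a running baseline. It is not the authors' code; the function names, the sample signal, and the threshold value 0.4 are all assumptions chosen for illustration, and the edge detection and adaptive sampling stages are not shown.

```python
import math


def sf_encode(signal, threshold):
    """Step-forward encoding (illustrative form, not the paper's code).

    Emits +1 when the sample exceeds baseline + threshold, -1 when it
    falls below baseline - threshold, and 0 otherwise. The baseline
    moves by one threshold step each time a spike is emitted.
    """
    base = signal[0]
    spikes = [0]  # no spike for the first sample; it sets the baseline
    for value in signal[1:]:
        if value > base + threshold:
            spikes.append(1)
            base += threshold
        elif value < base - threshold:
            spikes.append(-1)
            base -= threshold
        else:
            spikes.append(0)
    return spikes


def sf_decode(spikes, start, threshold):
    """Rebuild a staircase approximation of the signal from the spikes."""
    level = start
    reconstructed = []
    for spike in spikes:
        level += spike * threshold
        reconstructed.append(level)
    return reconstructed


def rmse(original, reconstructed):
    """Root mean square error between two equal-length sequences."""
    squared = [(x - y) ** 2 for x, y in zip(original, reconstructed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical 1D signal standing in for one of the X/Y signals.
signal = [0.0, 0.5, 1.0, 1.5, 1.0, 0.5, 0.0]
threshold = 0.4  # assumed value; in practice this would be tuned

spikes = sf_encode(signal, threshold)
reconstructed = sf_decode(spikes, signal[0], threshold)
error = rmse(signal, reconstructed)
print(spikes, round(error, 3))
```

A smaller threshold tracks the signal more closely but produces more spikes, so the threshold is the kind of parameter that error metrics such as RMSE can be used to tune.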

updated: Tue Oct 19 2021 19:31:52 GMT+0000 (UTC)

published: Tue Oct 19 2021 19:31:52 GMT+0000 (UTC)

arXiv
