ProgressiveSpinalNet architecture for FC layers

Praveen Chopra

In deep learning models, the fully connected (FC) layers play the most important role in classifying the input based on the features learned by the preceding layers. The FC layers also hold the largest number of parameters, and fine-tuning them consumes most of the computational resources. This paper therefore aims to reduce the number of FC parameters significantly while improving performance. The design is inspired by SpinalNet and other biologically motivated architectures. The proposed architecture has a gradient highway between the input and output layers, which solves the problem of diminishing gradients in deep networks. Every layer receives the output of the previous layers as well as the CNN layer output, so all layers contribute to the decision together with the last layer. This approach improves classification performance over the SpinalNet architecture and achieves SOTA performance on many datasets, such as Caltech101, KMNIST, QMNIST, and EMNIST. The source code is available at https://github.com/praveenchopra/ProgressiveSpinalNet.
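
The wiring described in the abstract can be illustrated with a small parameter-count sketch. It is not the author's implementation. It assumes that each hidden FC layer takes the CNN features concatenated with the output of the immediately preceding hidden layer, and that the final classification layer takes the CNN features plus all hidden outputs. The function names, the layer widths, and the plain two-layer baseline are hypothetical values chosen for illustration only and do not reproduce any figure from the paper.

```python
def progressive_head_shapes(feature_dim, hidden_width, num_hidden, num_classes):
    """Return (in_features, out_features) for each linear layer of the sketch.

    Assumed wiring (not confirmed by the abstract): every hidden layer sees
    the CNN features plus the previous hidden layer's output; the final layer
    sees the CNN features plus every hidden output.
    """
    shapes = []
    prev_out = 0  # the first hidden layer sees only the CNN features
    for _ in range(num_hidden):
        shapes.append((feature_dim + prev_out, hidden_width))
        prev_out = hidden_width
    shapes.append((feature_dim + num_hidden * hidden_width, num_classes))
    return shapes


def plain_head_shapes(feature_dim, hidden_width, num_classes):
    """Hypothetical baseline: one wide hidden FC layer followed by a classifier."""
    return [(feature_dim, hidden_width), (hidden_width, num_classes)]


def count_parameters(shapes):
    """Weights plus biases for a list of (in_features, out_features) layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


if __name__ == "__main__":
    # Illustrative sizes only: 512 CNN features, 10 classes.
    progressive = progressive_head_shapes(512, 20, 4, 10)
    plain = plain_head_shapes(512, 4096, 10)
    print("progressive head:", count_parameters(progressive))
    print("plain head:      ", count_parameters(plain))
```

Under these assumed sizes, narrow hidden layers that each re-read the CNN features need far fewer weights than a single wide hidden layer. Because the final layer is connected directly to the CNN features in this sketch, the gradient also has a one-layer path from the output back to the input, which is one way to read the "gradient highway" mentioned above.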

updated: Sun Mar 21 2021 11:54:50 GMT+0000 (UTC)

published: Sun Mar 21 2021 11:54:50 GMT+0000 (UTC)

arXiv
