HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization

Patrick Glandorf; Timo Kaiser; Bodo Rosenhahn

Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose Adaptive Regularized Training (ART), a novel sparse learning method that compresses dense networks into sparse ones. Instead of applying the binary mask commonly used during training to reduce the number of model weights, ART iteratively shrinks weights close to zero by increasing the weight regularization. The method compresses the knowledge of the pre-trained model into the weights of highest magnitude. To this end, we introduce a novel regularization loss named HyperSparse that exploits the highest-magnitude weights while conserving the ability to explore weights. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains over other sparsification methods, especially in extremely high sparsity regimes of up to 99.8 percent model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.
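
The general idea described in the abstract, shrinking weights with a growing regularization term and then keeping only the highest-magnitude ones, can be illustrated with a toy sketch. This is not the HyperSparse loss or the ART schedule, neither of which is specified here: it assumes a linear least-squares model, a plain L1 penalty, and a geometric growth of the penalty coefficient. The function names (`magnitude_mask`, `shrink_then_prune`) and all hyperparameters (`lam0`, `growth`, `lr`, `steps`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask keeping the (1 - sparsity) fraction of largest-|w| entries."""
    n_keep = max(1, int(round(weights.size * (1.0 - sparsity))))
    # Indices of the n_keep entries with the largest absolute value.
    keep_idx = np.argsort(np.abs(weights).ravel())[-n_keep:]
    mask = np.zeros(weights.size, dtype=bool)
    mask[keep_idx] = True
    return mask.reshape(weights.shape)


def shrink_then_prune(X, y, sparsity, lam0=1e-3, growth=1.01, lr=0.05, steps=500):
    """Toy stand-in: plain L1 with a growing coefficient (assumption, not HyperSparse)."""
    # Stand-in for the "pre-trained" dense model: an ordinary least-squares fit.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    lam = lam0
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)       # gradient of the mean squared error / 2
        w = w - lr * (grad + lam * np.sign(w))  # subgradient step on the L1 penalty
        lam *= growth                           # regularization increases every step
    # Final hard pruning: keep only the highest-magnitude weights.
    return w * magnitude_mask(w, sparsity)


# Synthetic problem in which only the first 5 of 50 weights matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[:5] = [4.0, -3.0, 2.5, -2.0, 1.5]
y = X @ true_w + 0.01 * rng.normal(size=200)

w_sparse = shrink_then_prune(X, y, sparsity=0.9)
print(np.count_nonzero(w_sparse), "of", w_sparse.size, "weights remain")
```

In this sketch the sparsity level only fixes how many weights survive the final pruning step. At 99.8 percent sparsity, for example, `magnitude_mask` keeps 2 out of every 1000 weights.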

updated: Wed Aug 16 2023 06:52:21 GMT+0000 (UTC)

published: Mon Aug 14 2023 14:18:11 GMT+0000 (UTC)

arXiv

