ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

Shuxuan Guo; Jose M. Alvarez; Mathieu Salzmann

We introduce an approach to training a given compact network. To this end, we leverage over-parameterization, which typically improves both the optimization and the generalization of neural networks. Specifically, we propose to expand each linear layer of the compact network into multiple consecutive linear layers, without adding any nonlinearity. As such, the resulting expanded network, or ExpandNet, can be contracted back to the compact one algebraically at inference. In particular, we introduce two convolutional expansion strategies and demonstrate their benefits on several tasks, including image classification, object detection, and semantic segmentation. As evidenced by our experiments, our approach outperforms both training the compact network from scratch and performing knowledge distillation from a teacher. Furthermore, our linear over-parameterization empirically reduces gradient confusion during training and improves the network's generalization.
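The algebraic contraction mentioned in the abstract can be illustrated in its simplest setting, a single fully-connected layer. The following is a minimal NumPy sketch, not the authors' implementation: the dimensions, the expansion rate `r`, and the names `expanded_forward` and `compact_forward` are assumptions chosen for illustration, and the convolutional strategies the abstract refers to are not covered here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; r is an assumed expansion rate for the hidden width.
d_in, d_out, r = 8, 4, 4
hidden = r * d_in

# Expanded form: two consecutive linear layers with no nonlinearity between them.
W1 = rng.standard_normal((hidden, d_in))
b1 = rng.standard_normal(hidden)
W2 = rng.standard_normal((d_out, hidden))
b2 = rng.standard_normal(d_out)


def expanded_forward(x):
    """Forward pass through the over-parameterized (expanded) layer pair."""
    return W2 @ (W1 @ x + b1) + b2


# Contraction: because both layers are linear, they collapse into one layer.
#   W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W = W2 @ W1
b = W2 @ b1 + b2


def compact_forward(x):
    """Forward pass through the contracted (compact) layer."""
    return W @ x + b


x = rng.standard_normal(d_in)
max_abs_diff = float(np.max(np.abs(expanded_forward(x) - compact_forward(x))))
```

The contracted layer has the same shape as the original compact layer (`d_out` by `d_in`), so the extra parameters exist only during training; `max_abs_diff` should be at the level of floating-point round-off.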

updated: Wed Apr 14 2021 11:55:22 GMT+0000 (UTC)

published: Mon Nov 26 2018 16:40:24 GMT+0000 (UTC)

arXiv
