Weight Evolution: Improving Deep Neural Networks Training through Evolving Inferior Weight Values

Zhenquan Lin; Kailing Guo; Xiaofen Xing; Xiangmin Xu

Convolutional neural networks are usually over-parameterized to obtain good performance. This phenomenon has stimulated two lines of research: pruning unimportant weights for compression, and reactivating unimportant weights to make full use of network capability. However, current weight reactivation methods usually reactivate entire filters, which may not be precise enough. Historically, filter pruning has prospered mainly because it is friendly to hardware implementation, but pruning at a finer structural level, i.e., individual weight elements, usually leads to better network performance. In this paper, we study the problem of weight element reactivation. Motivated by evolution, we select the unimportant filters and update their unimportant elements by combining them with the important elements of important filters, much as gene crossover produces better offspring. We call the proposed method weight evolution (WE). WE is composed of four main strategies. We propose a global selection strategy and a local selection strategy and combine them to locate the unimportant filters. A forward matching strategy finds the matched important filters, and a crossover strategy uses the important elements of those filters to update the unimportant filters. WE is a plug-in for existing network architectures. Comprehensive experiments show that WE outperforms other reactivation methods and plug-in training methods on typical convolutional neural networks, especially lightweight networks. Our code is available at https://github.com/BZQLin/Weight-evolution.
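
The abstract does not state how importance is measured or how filters are matched, so the following is only a rough sketch of the element-level crossover idea, not the authors' implementation (see the repository above for that). It assumes, purely for illustration, that filter importance is the L1 norm, that an element counts as important when its magnitude is at or above the median of its own filter, and that the weakest filters are paired with the strongest ones by rank. The function name `crossover_sketch` and the parameter `frac` are hypothetical.

```python
import numpy as np

def crossover_sketch(weights, frac=0.25):
    """Illustrative element-level crossover on one conv layer.

    weights: array of shape (out_channels, in_channels, k, k).
    frac:    fraction of filters treated as unimportant (assumed knob).
    All criteria here are assumptions, not the WE paper's definitions.
    """
    w = weights.copy()                      # leave the caller's array untouched
    n_filters = w.shape[0]
    flat = w.reshape(n_filters, -1)         # one row per filter (view of w)

    # Assumed importance score: L1 norm of each filter.
    scores = np.abs(flat).sum(axis=1)
    order = np.argsort(scores)              # ascending: weakest first

    # Cap at half the filters so weak and strong sets never overlap.
    k = max(1, min(n_filters // 2, int(frac * n_filters)))
    weak, strong = order[:k], order[::-1][:k]

    # Assumed pairing: weakest filter with strongest, and so on by rank.
    for wi, si in zip(weak, strong):
        if wi == si:                        # only possible with a single filter
            continue
        mag_w, mag_s = np.abs(flat[wi]), np.abs(flat[si])
        unimportant = mag_w < np.median(mag_w)   # weak elements of weak filter
        important = mag_s >= np.median(mag_s)    # strong elements of strong filter
        swap = unimportant & important
        flat[wi, swap] = flat[si, swap]     # copy important elements across

    return flat.reshape(weights.shape)
```

In this sketch only the selected weak filters are modified, and only at positions where the weak filter's element is small and the matched filter's element is large; the important filters themselves are left unchanged.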

updated: Sat Oct 09 2021 07:33:11 GMT+0000 (UTC)

published: Sat Oct 09 2021 07:33:11 GMT+0000 (UTC)

arXiv
