Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion

Mingbao Lin; Rongrong Ji; Bohong Chen; Fei Chao; Jianzhuang Liu; Wei Zeng; Yonghong Tian; Qi Tian

Mainstream filter pruning approaches usually either force a hard-coded importance estimation upon a computation-heavy pretrained model to select "important" filters, or impose a hyperparameter-sensitive sparse constraint on the loss objective to regularize network training. In this paper, we present a novel filter pruning method, dubbed dynamic-coded filter fusion (DCFF), to derive compact CNNs in a computation-economical and regularization-free manner for efficient image classification. Each filter in DCFF is first given an inter-similarity distribution with a temperature parameter as a filter proxy, on top of which a new Kullback-Leibler divergence based dynamic-coded criterion is proposed to evaluate filter importance. In contrast to simply keeping high-score filters as other methods do, we propose the concept of filter fusion, i.e., using the weighted averages computed with the assigned proxies as our preserved filters. We obtain a one-hot inter-similarity distribution as the temperature parameter approaches infinity. Thus, the relative importance of each filter can vary along with the training of the compact CNN, leading to dynamically changeable fused filters without either a dependency on the pretrained model or the introduction of sparse constraints. Extensive experiments on classification benchmarks demonstrate the superiority of DCFF over the compared counterparts. For example, DCFF derives a compact VGGNet-16 with only 72.77M FLOPs and 1.06M parameters while reaching a top-1 accuracy of 93.47% on CIFAR-10. A compact ResNet-50 is obtained with a 63.8% reduction in FLOPs and a 58.6% reduction in parameters, retaining 75.60% top-1 accuracy on ILSVRC-2012. Our code, narrower models and training logs are available at https://github.com/lmbxmu/DCFF.
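The three steps named in the abstract (a temperature-controlled inter-similarity distribution per filter, a Kullback-Leibler divergence based importance score, and fusion by weighted averaging) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so several details here are assumptions rather than the authors' implementation: Euclidean distance between flattened filters, a softmax over negative distances multiplied by the temperature `t` (chosen so that the distribution becomes one-hot as `t` grows, as the abstract states), and a score equal to the mean KL divergence from a filter's distribution to those of the other filters. The function names are hypothetical.

```python
import numpy as np


def inter_similarity(filters, t):
    """Row i is filter i's distribution over all filters (its proxy).

    Assumption: softmax over negative Euclidean distances scaled by t,
    so that the rows approach one-hot vectors as t grows.
    """
    w = filters.reshape(filters.shape[0], -1)            # (n, d)
    diff = w[:, None, :] - w[None, :, :]                 # (n, n, d)
    dist = np.sqrt((diff ** 2).sum(axis=-1))             # (n, n)
    logits = -dist * t
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def kl_importance(p, eps=1e-12):
    """Assumed score: mean KL(p_i || p_j) over all other filters j."""
    logp = np.log(p + eps)
    # kl[i, j] = sum_k p[i, k] * (log p[i, k] - log p[j, k])
    kl = (p * logp).sum(axis=1)[:, None] - p @ logp.T
    return kl.sum(axis=1) / (p.shape[0] - 1)


def fuse_filters(filters, t, keep):
    """Return `keep` fused filters and the indices of the filters they replace.

    Each fused filter is the weighted average of all original filters,
    weighted by the proxy distribution of a high-scoring filter.
    """
    p = inter_similarity(filters, t)
    scores = kl_importance(p)
    kept = np.sort(np.argsort(-scores)[:keep])
    w = filters.reshape(filters.shape[0], -1)
    fused = p[kept] @ w                                   # (keep, d)
    return fused.reshape((keep,) + filters.shape[1:]), kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer = rng.normal(size=(8, 3, 3, 3))                 # 8 filters, 3x3x3 each
    fused, kept = fuse_filters(layer, t=1.0, keep=4)
    print(fused.shape, kept)
```

Under these assumptions, a very large `t` makes every distribution one-hot, so the fused filters coincide with the selected original filters, while `t = 0` makes every distribution uniform, so every fused filter equals the mean of all filters. Recomputing the distributions during training is what would let the selected and fused filters change over time.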

updated: Wed Jul 14 2021 18:07:38 GMT+0000 (UTC)

published: Wed Jul 14 2021 18:07:38 GMT+0000 (UTC)

arXiv
