Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic

Nidhi Gowdra; Roopak Sinha; Stephen MacDonell; Wei Qi Yan

Convolutional Neural Networks (CNNs) such as ResNet-50, DenseNet-40 and ResNeXt-56 are severely over-parameterized, so the computational resources required for model training scale exponentially with increments in model depth. In this paper, we propose an Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic which is robust and simple, yet effective in resolving the problem of over-parameterization with regard to the network depth of CNN models. The EBCLE heuristic employs a priori knowledge of the entropic data distribution of input datasets to determine an upper bound for convolutional network depth, beyond which identity transformations are prevalent and contribute insignificantly to model performance. Restricting depth redundancies by forcing feature compression and abstraction limits over-parameterization while decreasing training time by 24.99% to 78.59% without degradation in model performance. We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE heuristic, which maintain or outperform the baseline classification accuracies of narrower yet deeper models. The EBCLE heuristic is architecturally agnostic, and EBCLE-based CNN models restrict depth redundancies, resulting in enhanced utilization of the available computational resources. The proposed EBCLE heuristic is a compelling technique for researchers to analytically justify their hyperparameter (HP) choices for CNNs. Empirical validation of the EBCLE heuristic in training CNN models was established on five benchmarking datasets (ImageNet32, CIFAR-10/100, STL-10, MNIST) and four network architectures (DenseNet, ResNet, ResNeXt and EfficientNet B0-B2), with appropriate statistical tests employed to support the conclusive claims presented in this paper.
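The abstract does not say how the entropy of a dataset is measured or how it is converted into a depth bound, so the following is only a minimal sketch of the first ingredient. It assumes, purely for illustration, that "entropic data distribution" can be approximated by the Shannon entropy of the pooled 8-bit pixel intensities. The function name `shannon_entropy_bits` and the synthetic data are hypothetical, and no mapping from entropy to layer count is attempted because the abstract does not give one.

```python
import math
import random
from collections import Counter
from typing import Iterable


def shannon_entropy_bits(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits per pixel, of a pooled intensity histogram.

    Assumption: this pooled-histogram estimate stands in for the paper's
    entropy measure, which the abstract does not define.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one pixel value")
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return float(sum((c / total) * math.log2(total / c) for c in counts.values()))


# Hypothetical stand-ins for real datasets (flattened 8-bit intensities).
random.seed(0)
noise_pixels = [random.randrange(256) for _ in range(100 * 32 * 32)]
constant_pixels = [0] * (32 * 32)

print(shannon_entropy_bits(noise_pixels))     # close to the 8-bit maximum
print(shannon_entropy_bits(constant_pixels))  # a constant image carries no information
```

Under this assumption, a dataset with a richer intensity distribution yields a higher entropy estimate than a near-uniform one, which is the kind of a priori quantity the heuristic is described as using.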

updated: Sun Jun 27 2021 10:34:39 GMT+0000 (UTC)

published: Sun Jun 27 2021 10:34:39 GMT+0000 (UTC)

arXiv
