Speeding Up EfficientNet: Selecting Update Blocks of Convolutional Neural Networks using Genetic Algorithm in Transfer Learning

Md. Mehedi Hasan; Muhammad Ibrahim; Md. Sawkat Ali

The performance of a convolutional neural network (CNN) depends heavily on its architecture, and its transfer learning performance depends strongly on which layers are made trainable. Selecting the most effective update layers for a given target dataset often requires expert knowledge of CNN architectures that many practitioners do not possess. General users prefer to use an available architecture (e.g. GoogleNet, ResNet, EfficientNet) developed by domain experts. As the number of layers keeps growing, handpicking the update layers becomes increasingly difficult and cumbersome. In this paper we therefore explore the application of a genetic algorithm to mitigate this problem. The convolutional layers of popular pretrained networks are often grouped into modules that constitute their building blocks. We devise a genetic algorithm that selects the blocks of layers whose parameters are updated. By experimenting with EfficientNetB0 pre-trained on ImageNet and using Food-101, CIFAR-100 and MangoLeafBD as target datasets, we show that our algorithm yields similar or better accuracy than the baseline and requires less training and evaluation time because fewer parameters are learned. We also devise a metric called block importance to measure the efficacy of each block as an update block, and we analyze the importance of the blocks selected by our algorithm.
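
The idea of searching over update blocks with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoding, operators, or fitness function, so everything below is an assumption. Each candidate is a binary mask over blocks (1 means the block is trainable), the block count of seven is assumed to match EfficientNetB0's MBConv stages, and `toy_fitness` is a hypothetical stand-in for the validation accuracy that a real run would obtain by fine-tuning with the masked blocks unfrozen.

```python
import random

NUM_BLOCKS = 7  # assumption: seven selectable blocks (EfficientNetB0's MBConv stages)


def toy_fitness(mask):
    """Hypothetical stand-in for validation accuracy after fine-tuning.

    A real run would unfreeze the blocks where mask[i] == 1, fine-tune on
    the target dataset, and return validation accuracy. Here later blocks
    are simply assumed to help more, with a fixed penalty per trainable
    block, so the script runs without any deep learning framework.
    """
    gain = sum((i + 1) * bit for i, bit in enumerate(mask))
    cost = 1.5 * sum(mask)
    return gain - cost


def crossover(a, b, rng):
    """Single-point crossover of two parent masks."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(mask, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def select_blocks(fitness, generations=20, pop_size=12, seed=0):
    """Evolve block masks and return the fittest one in the final population."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(NUM_BLOCKS)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection; parents survive
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best_mask = select_blocks(toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

In practice the fitness call dominates the cost, since each evaluation is a fine-tuning run, so a real search would likely cache fitness values per mask and keep the population and generation counts small.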

updated: Wed Mar 01 2023 06:35:29 GMT+0000 (UTC)

published: Wed Mar 01 2023 06:35:29 GMT+0000 (UTC)

arXiv
