Adaptively Customizing Activation Functions for Various Layers

Haigen Hu; Aizhu Liu; Qiu Guan; Xiaoxin Li; Shengyong Chen; Qianwei Zhou

Activation functions enhance the nonlinearity of neural networks and increase their ability to map inputs to response variables, so they play a crucial role in modeling complex relationships and patterns in data. In this work, a novel methodology is proposed to adaptively customize activation functions by adding only a few parameters to traditional activation functions such as Sigmoid, Tanh, and ReLU. To verify the effectiveness of the proposed methodology, theoretical and experimental analyses of how it accelerates convergence and improves performance are presented, and a series of experiments is conducted on various network models (such as AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet) and various datasets (such as CIFAR10, CIFAR100, miniImageNet, PASCAL VOC, and COCO). To further verify its validity and suitability across optimization strategies and usage scenarios, comparison experiments are also carried out with different optimization strategies (such as SGD, Momentum, AdaGrad, AdaDelta, and ADAM) and different recognition tasks, such as classification and detection. The results show that the proposed methodology is very simple yet achieves significant gains in convergence speed, precision, and generalization, and that it surpasses popular methods like ReLU and adaptive functions like Swish in overall performance in almost all experiments. The code is publicly available at https://github.com/HuHaigen/Adaptively-Customizing-Activation-Functions. The package includes the three proposed adaptive activation functions for reproducibility purposes.
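The abstract does not state the exact parameterization, so the following is only a minimal sketch of the general idea of wrapping a fixed activation with a few adjustable parameters per layer. The form f(x) = a * base(b * x), the class name `ScaledActivation`, and the parameters `a` and `b` are assumptions made for illustration and should not be read as the authors' definitions; the linked repository holds the actual functions.

```python
import math
from dataclasses import dataclass
from typing import Callable


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Standard rectified linear unit."""
    return max(0.0, x)


@dataclass
class ScaledActivation:
    """Hypothetical adaptive activation: f(x) = a * base(b * x).

    `a` (output scale) and `b` (input scale) stand in for the few extra
    parameters a layer could learn. This form is an assumption for
    illustration, not the parameterization used in the paper.
    """
    base: Callable[[float], float]
    a: float = 1.0
    b: float = 1.0

    def __call__(self, x: float) -> float:
        return self.a * self.base(self.b * x)

    def param_grads(self, x: float, eps: float = 1e-6) -> tuple:
        """Central-difference estimates of df/da and df/db at x."""
        up_a = ScaledActivation(self.base, self.a + eps, self.b)(x)
        dn_a = ScaledActivation(self.base, self.a - eps, self.b)(x)
        up_b = ScaledActivation(self.base, self.a, self.b + eps)(x)
        dn_b = ScaledActivation(self.base, self.a, self.b - eps)(x)
        return (up_a - dn_a) / (2 * eps), (up_b - dn_b) / (2 * eps)


# One instance per layer, so each layer can end up with its own shape.
layer_activations = [
    ScaledActivation(sigmoid),            # a = b = 1 recovers plain Sigmoid
    ScaledActivation(math.tanh, a=1.5),   # taller Tanh
    ScaledActivation(relu, a=2.0, b=0.5),
]
```

With `a = b = 1` each wrapper reduces to its base function, so a network initialized this way starts from the traditional activation and can move away from it only if training finds that useful. In a real framework the two scalars would be registered as trainable parameters and updated by the optimizer; the finite-difference helper here only shows that the output does respond to them.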

updated: Fri Dec 17 2021 11:23:03 GMT+0000 (UTC)

published: Fri Dec 17 2021 11:23:03 GMT+0000 (UTC)

arXiv
